

COLLEGE OF EDUCATION 


A CRITICAL STUDY OF MEASURES OF
ACHIEVEMENT RELATIVE TO
CAPACITY


By 
C. W. ODELL 


Assistant Director, Bureau of Educational Research 


PRICE 50 CENTS 


PUBLISHED BY THE UNIVERSITY OF ILLINOIS, URBANA 
1929 




TABLE OF CONTENTS

CHAPTER I. DEFINITION OF TERMS AND STATEMENT OF PROBLEM

CHAPTER II. A BRIEF ACCOUNT OF PROPOSED MEASURES

CHAPTER III. MERITS AND DEMERITS OF THE VARIOUS PROPOSED MEASURES

CHAPTER IV. THE VALIDITY OF MEASURES OF ACHIEVEMENT RELATIVE TO CAPACITY

CHAPTER V. THE RELIABILITY OF MEASURES OF ACHIEVEMENT RELATIVE TO CAPACITY

CHAPTER VI. SUMMARY AND CONCLUSIONS

APPENDIX A. THE REDUCTION OF NEGATIVE CORRELATION BETWEEN INTELLIGENCE QUOTIENTS AND ACHIEVEMENT QUOTIENTS

APPENDIX B. ESTIMATING THE RELIABILITY OF ACHIEVEMENT QUOTIENTS FROM THAT OF ACHIEVEMENT AND MENTAL AGES

APPENDIX C. A COMPARISON OF THE RELIABILITY OF QUOTIENT MEASURES AND DIFFERENCE MEASURES


LIST OF TABLES

TABLE I. COMPARISON OF DIFFERENCES FOUND BY PINTNER’S METHOD WITH ACHIEVEMENT QUOTIENTS

TABLE II. COMPARISON OF TORGERSON’S EFFICIENCY QUOTIENTS WITH ACHIEVEMENT QUOTIENTS

TABLE III. COMPARISON OF INDICES OF EFFORT COMPUTED BY THE TWO METHODS SUGGESTED BY SYMONDS

TABLE IV. LENGTH OF SCHOOL ATTENDANCE AT VARIOUS MENTAL AGES FOR CHILDREN OF DIFFERENT INTELLECTUAL LEVELS

TABLE V. MEASURES OF RELIABILITY OF AGE AND QUOTIENT SCORES OBTAINED BY THE WRITER

TABLE VI. COMPARISON OF PROBABLE ERRORS OF ACHIEVEMENT QUOTIENTS COMPUTED BY THE FORMULA [formula illegible in source] WITH THOSE COMPUTED FROM THE QUOTIENTS THEMSELVES

TABLE VII. COMPARISON OF RATIOS OF PROBABLE ERRORS OF MEASUREMENT TO THE MEAN AND STANDARD DEVIATION COMPUTED BY THE FORMULA FOR THE PROBABLE ERROR OF A DIFFERENCE WITH THOSE OBTAINED FROM THE ACTUAL ACHIEVEMENT QUOTIENTS

TABLE VIII. COMPARISON OF RELIABILITY OF DIFFERENCE MEASURES AND QUOTIENT MEASURES DERIVED FROM THE SAME TEST SCORES


A CRITICAL STUDY OF MEASURES OF 
ACHIEVEMENT RELATIVE TO CAPACITY 


CHAPTER I 
DEFINITION OF TERMS AND STATEMENT OF PROBLEM 


Introduction. The use of measures of achievement relative to ca- 
pacity has been one of the most enthusiastically recommended and 
widely employed procedures that have arisen in connection with the 
standardized test movement. Almost at once after the first specific 
public suggestions as to how to compute such measures, they were 
received with popular favor by both the leaders and the rank and 
file of the educational profession. As is frequently true, so in this
instance, the majority of those who adopted such procedures did so
noncritically.1 From time to time some unusually thoughtful worker called
attention to the very serious deficiencies of the measures employed, but 
only recently has a considerable amount of attention been focused upon 
this point. It has, therefore, appeared worth while to devote this 
bulletin to a critical study of measures of achievement relative to ca- 
pacity. Before proceeding further, a number of more or less technical 
terms employed in the discussion will be defined, so that there may be 
no doubt as to their exact significance. Three of these are rather 
general; the others are names of various measures of achievement and 
intelligence. 

Achievement. The expression “achievement” is used to refer to 
the quantity and quality of school work done by pupils. Thus a 
measure of achievement is a measure of how much school work pupils 
have covered and how well they have mastered it. 

Capacity. As employed in this bulletin, “capacity” is limited to 
mental capacity or potential ability to do school work. It is most 
often determined by the use of general intelligence tests. That is, a 
pupil’s capacity is considered to be indicated by the score he makes
upon such a test.

General intelligence (or intelligence). As implied in the preceding 
paragraph, “general intelligence” or merely “intelligence” is used as 
synonymous with capacity to do school work. In other words, a
pupil’s general intelligence is thought of as his potential ability to
learn the subject-matter presented in school.

1As an example of the non-critical attitude toward the use of measures of achievement
relative to capacity, the following quotation from an article which appeared in 1922 may be
given. “We believe that the Accomplishment Quotient is the fairest and most valuable measure
of both the efficiency of teacher and the pupil, that by relying on it for guidance, the
teacher will come to exact from him that hath even more than he has been given and take
from him that hath not even less than he has been able to give.”

Bulletin No. 45


Classified list of measures to be defined. The measures of achieve- 
ment and intelligence to be defined may be classified as follows, first 
as absolute2 or relative3 and then under further subdivisions:


I. Absolute measures4
A. Measures of intelligence
1. Mental age


B. Measures of achievement
1. Achievement age
2. Accomplishment age
3. Attainment age
4. Subject age
5. Educational age


C. Measures of either intelligence or achievement
1. T-score5 (or sigma score)


II. Relative measures


A. Measures of intelligence relative to chronological age6
1. Quotient measures7
a. Intelligence quotient
2. Sigma measures
a. Mental index8


2Absolute measures are those that express achievement or intelligence, as the case may 
be, directly in terms of some unit of measurement of the characteristic measured and not by 
comparison with some other characteristic. 

3Relative measures are those which compare absolute measures of one characteristic 
with those of another and express the first relative to the second. 

4No attempt will be made to give definitions of all suggested absolute measures of 
either capacity or achievement. Only those that have to date been employed in securing 
relative measures, and therefore are referred to in this bulletin, will be defined. All of the 
absolute measures defined except the T-score are age measures. The T-score is a sigma measure; that
is, it is expressed in terms of the sigma or standard deviation of the scores of the whole 
group. The standard deviation is that distance which, if laid off on either side from the mean 
of a normal distribution, includes 34.13 per cent of all the cases. 

5The T-score is most commonly employed as a measure of achievement, but may equally
well be used for intelligence.

6A number of other relative measures in addition to the two named above have been 
proposed, but as none of them are involved in the computation of achievement relative to 
capacity, they will not be defined. The references describing more fully the relative measures 
are not given here, but may be found in Chapter II in connection with the accounts of their 
origins. 

7The term “quotient measure” is employed here in a general sense to refer to any
measure that involves the division of one quantity by another. It would be equally appro- 
priate to use the expression ‘‘ratio measure.” 

8The reader may wonder why the mental index, and later the educational index, are 
classed as relative measures rather than as absolute measures, since they as well as the T-score 
are sigma measures. The reason is that they are based upon distributions for each age and 
thus really show capacity or achievement, as the case may be, relative to the age of the in- 
dividual concerned, whereas the T-score is based upon a single age group and does not 
indicate capacity or achievement relative to the individual’s age. 




B. Measures of achievement relative to chronological age
1. Quotient measures
a. Subject quotient
b. Educational quotient
c. Accomplishment quotient (sometimes)


2. Sigma measures
a. Educational index9


C. Measures of achievement relative to capacity
1. Quotient measures
a. Achievement quotient
b. Accomplishment quotient
c. Attainment quotient
d. Achievement ratio
e. Accomplishment ratio
f. Subject ratio
g. Educational ratio
h. Efficiency quotient


2. Difference measures10
a. Pintner’s difference
b. Symonds’s index of studiousness or of effort


Mental age (M.A.). Mental age is ordinarily used to refer to a 
score on a general intelligence test expressed in terms of age units. 
A mental age of a given amount is equal to the average intelligence of 
an unselected or random group of pupils of that age. For example, 
if the average score made by pupils eleven years and six months of 
age upon a certain general intelligence test is 95, that score is said 
to be equivalent to a mental age of eleven years and six months. 

Achievement age (A.A.). Achievement age is similar to mental 
age except that it represents a pupil’s score upon an achievement 
rather than upon a general intelligence test. Therefore, an achieve- 
ment age of a given amount, such as ten years and eight months, is a 
means of expressing the average score made by pupils of that chrono- 
logical age. 

Accomplishment age (A.A.). This expression is often used instead 
of achievement age. The two are entirely synonymous. 

Attainment age (A.A.). This is synonymous with achievement age 
and accomplishment age, but is rarely used. 


9See note on mental index.


10A difference measure is one found by subtracting one measure from another.




Subject age (S.A.). Subject age is somewhat synonymous with 
achievement age, but differs in that it is limited to a pupil’s age score 
in a single subject. Thus a pupil may have a subject age in arithmetic, 
a subject age in reading, and so on. 

Educational age (E.A.). Educational age is likewise largely 
synonymous with achievement age. The difference is that it is applied 
only to a pupil’s average standing in a number of school subjects, 
whereas achievement age may refer either to such an average or to his 
standing in a single subject. 


T-score (or sigma score). A T-score is one given according to 
the T-scale, which is based upon the distribution of ability of an un- 
selected group of twelve-year-old pupils. The scale consists of 100 
units of .1 standard deviation each, extending from five standard de- 
viations below average twelve-year-old ability to five standard devia- 
tions above average. T-scores, therefore, range from 0 to 100, with 
50 as average. They are not commonly used in the case of pupils 
whose ages differ from twelve by more than three or four years, but 
very similar scores may be found for groups of such pupils. All 
scores of this general sort are often called sigma scores. 
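The scale just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the parameter names `mean_12` and `sd_12` (the mean and standard deviation of the unselected twelve-year-old group) are hypothetical, and the clamping to the range 0 to 100 simply reflects the statement that the scale extends five standard deviations either side of the average.

```python
def t_score(raw_score, mean_12, sd_12):
    """Convert a raw test score to a T-score.

    mean_12 and sd_12 are assumed to be the mean and standard deviation
    of an unselected group of twelve-year-old pupils on the same test
    (hypothetical names, not taken from the bulletin).
    """
    # Each T unit is 0.1 standard deviation, with 50 at the group mean.
    t = 50 + 10 * (raw_score - mean_12) / sd_12
    # The scale runs from 5 SD below the mean (0) to 5 SD above (100).
    return max(0.0, min(100.0, t))
```

For example, with an assumed mean of 50 and standard deviation of 10, a raw score of 60 lies one standard deviation above average and so corresponds to a T-score of 60.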


Intelligence quotient (I.Q.). The intelligence quotient is found by 
dividing an individual’s mental age by his chronological age. In
formula form, $\text{I.Q.} = \dfrac{\text{M.A.}}{\text{C.A.}}$. In writing the result, it is carried to two
places and the decimal point omitted. Thus an individual whose
mental age is the same as his chronological age, or, in other words, 
the same as the average mental age of all persons of his chronologi- 
cal age, has an intelligence quotient of 100. If his mental age is 
greater than his chronological age, his intelligence quotient is pro- 
portionately greater, and if less, it is less. For persons in the upper 
teens or above, the actual chronological age is not employed as the 
divisor, but instead sixteen or some other fixed age supposed to repre- 
sent the point at which the growth of intelligence ceases. 
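A minimal sketch of this computation follows. It assumes ages are expressed in months and that sixteen years is the fixed divisor for older persons; the text allows "some other fixed age," so `adult_age_months` is left as a parameter. The function and parameter names are illustrative only.

```python
def intelligence_quotient(mental_age_months, chronological_age_months,
                          adult_age_months=16 * 12):
    """Return the I.Q. as conventionally written (no decimal point).

    For persons older than the assumed age at which growth of
    intelligence ceases, that fixed age replaces chronological age
    as the divisor.
    """
    divisor = min(chronological_age_months, adult_age_months)
    ratio = mental_age_months / divisor
    # Carry to two places and drop the decimal point, rounding halves up.
    return int(ratio * 100 + 0.5)
```

Under these assumptions a pupil aged ten years (120 months) with a mental age of twelve years and six months (150 months) has an I.Q. of 125, and a person of twenty with a mental age of sixteen has an I.Q. of 100.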


Mental index (M.I.). This measure performs the same function 
as the intelligence quotient in that it compares the intelligence of an 
individual with the average intelligence of persons of his age, but the 
method of computation is decidedly different. It is determined ac- 
cording to a scale based upon an assumption of normal distribution 
of ability of pupils of each age, and ranges from 0 through 50 as 
average or normal up to 100. Its chief difference from the T-score 
is that it is based upon the distribution of scores of the age group
to which the individual belongs rather than always upon that of the
same age group.

Subject quotient (S.Q.). There is the same difference between the 
subject quotient and the second or less usual meaning of the achievement
quotient as between the subject age and the achievement age.
That is, the subject quotient refers to results from a single subject
alone, whereas the achievement quotient may also apply to a combination
of results from several subjects. It is found by the formula
$\text{S.Q.} = \dfrac{\text{S.A.}}{\text{C.A.}}$.

Educational quotient (E.Q.). This is nearly synonymous with
achievement quotient in its second or less common sense, since
$\text{E.Q.} = \dfrac{\text{E.A.}}{\text{C.A.}}$. It differs in the same way as does the educational age
from the achievement age, in that it is employed only to refer to
combined or average results in several school subjects, whereas the
latter may also refer to a single subject.

Educational index (E.I.). The educational index is similar to the
mental index except that it is derived from scores on achievement 
rather than intelligence tests. 

Achievement quotient (A.Q.). Two chief methods of securing an 
achievement quotient have been suggested. One of these involves 
the division of the achievement age by the mental age; that is,
$\text{A.Q.} = \dfrac{\text{A.A.}}{\text{M.A.}}$.11 This is the more usual of the two and may be considered the
standard method; therefore, unless otherwise stated, the writer will
employ the term in this sense. The less common method of securing 
an achievement quotient is to divide achievement age by chronological 
age. It is used in this sense by those who employ achievement ratio 
instead of achievement quotient for achievement age divided by men- 
tal age. As is true of the intelligence quotient, so the achievement and 
all other quotients and ratios are regularly carried to two places and 
written without a decimal point. The expression has also been used 
by several persons with meanings varying somewhat from either of 
those just given, but these minor differences will not be dealt with
here.

Accomplishment quotient (A.Q.). This term is synonymous with
achievement quotient.


11Sometimes the formula $\text{A.Q.} = \dfrac{\text{E.Q.}}{\text{I.Q.}}$ is employed. It yields exactly the same result
as the one given in the text. See p. 13.




Attainment quotient (A.Q.). This is synonymous with achieve- 
ment quotient and accomplishment quotient, but very rarely employed. 


Achievement ratio (A.R.). Achievement ratio is synonymous with
achievement quotient in its first or more frequent sense. That is,
$\text{A.R.} = \dfrac{\text{A.A.}}{\text{M.A.}}$.

Accomplishment ratio (A.R.). This is synonymous with achievement
ratio.

Subject ratio (S.R.). This is similar to the achievement ratio
except that it is limited to a single subject. $\text{S.R.} = \dfrac{\text{S.A.}}{\text{M.A.}}$ or $\dfrac{\text{S.Q.}}{\text{I.Q.}}$.

Educational ratio (E.R.). This is similar to the achievement ratio
except that it always refers to combined results from several
school subjects. In other words, $\text{E.R.} = \dfrac{\text{E.A.}}{\text{M.A.}}$ or $\dfrac{\text{E.Q.}}{\text{I.Q.}}$.

Efficiency quotient (Eff.Q.). This is a suggested but very rarely
used measure of achievement relative to capacity. It is secured by
dividing a pupil’s achievement point score by the norm for his age and
then further dividing this result by his intelligence quotient.

Pintner’s difference. This is a measure little used except by
Pintner and those associated with him. It is found by subtracting
an individual’s mental index from his educational index. Thus a
positive difference indicates that a pupil is doing better work than is
done by average pupils of his capacity, a difference of zero that he
is doing work of the same quality, and a negative difference that he
is doing work of an inferior quality.

Index of studiousness or of effort. This is a rather general term 
proposed by Symonds to include various more or less similar methods 
of comparing achievement with capacity. These methods are especial- 
ly intended for use in high school, where the difficulties connected with 
employing the various quotient and ratio measures are serious. Two 
possible methods of computing it are suggested. One of them consists 
merely in ranking pupils according to their achievement and also ac- 
cording to their capacity, and taking the differences. The second is 
somewhat more difficult. It requires that both achievement and gen- 
eral intelligence test scores be turned into standard deviation units. 
The difference between the two standard deviation scores of each
pupil is then found, multiplied by ten, and added algebraically to
fifty.12


12The multiplication by ten is merely to avoid fractional indices, and the addition to
fifty to eliminate negative ones.
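The two methods of computing the index of effort described above may be sketched as follows. Several details are assumptions rather than statements of the source: rank 1 is given to the highest score, ties are broken by order of appearance, the rank difference is taken as capacity rank minus achievement rank (so that a positive value means achievement outranks capacity), the standard-deviation difference is taken as achievement minus capacity, and the population standard deviation of the group is used. All function names are hypothetical.

```python
from statistics import mean, pstdev


def rank_positions(scores):
    """Rank pupils by score; rank 1 is the highest score (assumed)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for position, pupil in enumerate(order, start=1):
        ranks[pupil] = position
    return ranks


def effort_by_ranks(achievement, capacity):
    """First method: difference between capacity rank and achievement rank."""
    a_ranks = rank_positions(achievement)
    c_ranks = rank_positions(capacity)
    return [c - a for a, c in zip(a_ranks, c_ranks)]


def standard_scores(scores):
    """Express each score in standard deviation units of its own group."""
    m, s = mean(scores), pstdev(scores)
    return [(x - m) / s for x in scores]


def effort_by_sigma(achievement, capacity):
    """Second method: ten times the difference in SD units, added to fifty."""
    z_a = standard_scores(achievement)
    z_c = standard_scores(capacity)
    return [50 + 10 * (a - c) for a, c in zip(z_a, z_c)]
```

With either method a pupil whose standing in achievement matches his standing in capacity receives the neutral value (zero by the first method, fifty by the second).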




Purpose and plan of this study. It has already been stated that 
the general purpose of this bulletin is to present a critical study of 
measures of achievement relative to capacity. As preliminary to this 
study, a brief historical account of the origin of various proposed 
measures of the sort mentioned will be given. This will be found in 
Chapter II. Following that, an attempt will be made to answer three 
chief questions, as follows: 

1. What are the merits and demerits of the various proposed
measures?

2. How valid13 are such measures?

3. How reliable14 are such measures?

Chapter III will deal with the first of these questions, Chapter IV 
with the second, and Chapter V with the third. As part of his attempt 
to give answers, the writer will refer to practically all published critical 
discussions pertaining to the topic and will also present some hitherto 
unpublished data which he has gathered. Chapter VI will contain a 
brief summary and the general conclusions. Certain supplementary 
data and discussions will be given in Appendices A, B, and C. 


13A test or measure is valid or possesses validity if it fulfills the function which it is
intended or stated to perform. It may lack validity either because it is unreliable or because
it measures some other ability or abilities than the statement of its function specifies. Since
few, if any, tests possess perfect validity, the term is used relatively, and tests are said to
be valid when they approach perfect validity. See [reference illegible in source].

14A test or measure is reliable, or has reliability, if a second application of the test
[...] with [...]. In other words, a test is perfectly reliable not only if each pupil makes exactly the
same score the second time as the first, but also if there is a constant and known difference,
either an amount or a ratio, between the first set of scores and the second. For example,
if each pupil’s score were four points greater [...]
increased by 10 per cent of his original score, [...]
even the best standardized tests are not perfectly reliable, [...]
to refer to those which approach perfect reliability.


CHAPTER II 
A BRIEF ACCOUNT OF PROPOSED MEASURES 


The educational and subject quotients. Apparently the first quotient
or ratio scores to receive use in connection with standardized
tests were the educational and subject quotients. These, along with 
the educational and subject ages, were suggested by McCall and em- 
ployed by him and his students before 1920. It appears, however, 
that these measures were not mentioned in print until later than the 
achievement age and the achievement quotient. As may be seen by re- 
ferring to the definitions of the educational and subject quotients given 
in Chapter I, they are measures similar to the previously well accepted 
intelligence quotient1 in that they compare pupil performance with
chronological age. 

The achievement quotient. There is some doubt as to who should 
receive most credit in connection with the idea of comparing achieve- 
ment test scores with those on intelligence tests by the quotient or 
ratio method. The use of standardized achievement tests, which be- 
gan in 1908, became fairly common within a few years, so that by 
1915 they were being used in many school surveys and by many ad- 
ministrators, supervisors, and teachers. At first, practically all per- 
sons who interpreted the results of such tests appear to have done so 
without regard to the capacity of the pupils with whom they were 
dealing. Occasionally someone more thoughtful than most made an 
attempt to interpret achievement in the light of pupils’ intelligence, 
but it was not until about 1920 that this procedure began to be dis- 
cussed in print, in public addresses, or elsewhere. 

Among the earliest workers to point out the desirability of comparing
pupils’ achievement with their intelligence, or capacity to
achieve, were the Presseys. In an address at the seventh Indiana
University Conference on Educational Measurements in April, 1920,
Mrs. Pressey devoted considerable attention to this point.2 Indeed,
she even made use of the expression “quotient of achievement” in 
connection with the comparison of achievement and capacity. She 
did not, however, employ this phrase in the same delimited sense in 
which “achievement quotient” soon came to be used, but in a much
more general way to refer to any measure which served the purpose
indicated. Furthermore, although the Presseys continued to emphasize
the importance of making such comparisons, they did not press
the use of this or any similar term.

1For the origin of the intelligence quotient, see:
Freeman, F. N. Mental Tests. Boston: Houghton Mifflin Company, 1926, p. 98.

2Pressey, L. C. “The Relation of Intelligence to Achievement in the Second Grade,”
Bulletin of the Extension Division, Vol. 6, No. 1. Bloomington, Indiana: Indiana University,
September, 1920, p. 68-77.

Franzen thought of employing a quotient or ratio score at least 
as early as 1919 when he was a student under McCall and appears 
to deserve credit for originating the idea. However, he did not pub- 
lish it immediately, and the first proposal of the achievement quotient 
as now employed to appear in print seems to have been in connection 
with the Illinois Examination published by the Bureau of Educational
Research of the University of Illinois, in 1920.3 In response to
a demand for a battery of tests which could be employed for general 
survey purposes, the Monroe Standardized Silent Reading Tests, Re- 
vised, the Monroe General Survey Scale in Arithmetic, and an in- 
telligence test later called the Illinois General Intelligence Scale, were 
combined. In connection with this battery, the already rather com- 
mon practice of transmuting scores made upon an intelligence scale 
into mental ages was followed. In addition, Monroe suggested that 
scores upon the reading and arithmetic tests be turned into achieve- 
ment ages. He also provided for the computation of achievement 
quotients4 (that is, achievement ages divided by mental ages) as
measures of achievement compared with capacity to achieve. 

The accomplishment quotient. As has already been stated, 
Franzen had previously conceived the idea of a measure similar to 
that suggested by Monroe, but did not publish it until later. This 
measure, which he termed the “accomplishment quotient,” appears to 
have been first mentioned in print in the fall of 1920,5 a few months
after Monroe’s proposal had appeared. He suggested a different 
way of computing it which, however, differs from Monroe’s only in 


method and not in result. Franzen’s method is to divide the educational
quotient by the intelligence quotient. Since $\text{E.Q.} = \dfrac{\text{E.A.}}{\text{C.A.}}$ and
$\text{I.Q.} = \dfrac{\text{M.A.}}{\text{C.A.}}$, it can readily be seen that the formula $\text{A.Q.} = \dfrac{\text{E.Q.}}{\text{I.Q.}}$
easily reduces to $\text{A.Q.} = \dfrac{\text{E.A.}}{\text{M.A.}}$,6 which is synonymous with $\dfrac{\text{A.A.}}{\text{M.A.}}$.

3Monroe, W. S. and Buckingham, B. R. “Illinois Examination: Teacher’s Handbook.”
Urbana: University of Illinois, Bureau of Educational Research, 1920.
Monroe, W. S. and Buckingham, B. R. “The Illinois Examination I and II: Teacher’s
Handbook.” Bloomington, Illinois: Public School Publishing Company, 1920. 32 p.
Monroe, W. S. “The Illinois Examination,” University of Illinois Bulletin, Vol. 19,
No. 9, Bureau of Educational Research Bulletin No. 6. Urbana: University of Illinois, 1921.

4At the time Monroe conceived the idea of the achievement quotient, he had not yet
seen Franzen’s similar idea.

5Franzen, R. H. “The Accomplishment Quotient of School Marks in Terms of Individual
Capacity,” Teachers College Record, 21:432-40, November, 1920.
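The reduction of Franzen's formula to Monroe's can be checked numerically. The sketch below uses exact fractions so that the two methods can be compared without rounding error; ages are assumed to be in months, and the function names are illustrative only.

```python
from fractions import Fraction
from math import floor


def aq_by_ages(achievement_age, mental_age):
    """Monroe's method: achievement age divided by mental age."""
    return Fraction(achievement_age, mental_age)


def aq_by_quotients(achievement_age, mental_age, chronological_age):
    """Franzen's method: educational quotient divided by intelligence quotient."""
    educational_quotient = Fraction(achievement_age, chronological_age)
    intelligence_quotient = Fraction(mental_age, chronological_age)
    # The common denominator (chronological age) cancels in this division.
    return educational_quotient / intelligence_quotient


def as_written(quotient):
    """Carry to two places and omit the decimal point, rounding halves up."""
    return floor(quotient * 100 + Fraction(1, 2))
```

For any positive ages the two functions return the same fraction, which is the sense in which the methods differ "only in method and not in result."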


The accomplishment ratio. Although the use of the achievement 
or accomplishment quotient at once began to be very common, Franzen
appears to have been dissatisfied with the latter term. Less than two
years later he began to advocate the use of a different expression,
“accomplishment ratio,” for the same idea.7 At this time he employed
the term “quotient” to refer to a measure secured when the divisor is
chronological age and “ratio” to refer to one secured when the
divisor is mental age. Moreover, he practically dropped accomplishment
quotient in favor of subject quotient and educational quotient.8
Furthermore, he employed accomplishment ratio in a more limited 
sense than he had previously used accomplishment quotient, limiting it 
to educational age divided by mental age, and employing the expres- 
sion “subject ratio” for subject age divided by mental age. Although 
many workers have followed Franzen in his use of accomplishment 
ratio instead of accomplishment quotient, the tendency has been not 
to limit it to educational age divided by mental age, but rather to use 
it for either that or subject age divided by mental age. In other words, 
when used it has generally been entirely synonymous with achievement 
quotient as suggested by Monroe and with Franzen’s first use of ac- 
complishment quotient. 

Although it was undoubtedly desirable that distinct terms be em- 
ployed to refer to the relation of achievement to chronological age 
and to mental age or capacity, and although there appears to have 
been no logical objection to the terms and distinctions suggested by 
Franzen, they did not come into general use. Monroe’s achievement 
quotient and Franzen’s similar accomplishment quotient, both ab- 
breviated A.Q., had become broadly enough known and used by the 
date of Franzen’s later suggestions that the latter did not cause a gen- 
eral change of practice in respect to the matter. On the ®ther hand, 
quite a number of persons followed Franzen in his later suggestions. 
Thus, although his first proposal was more generally accepted than the 
second, confusion was introduced in that accomplishment quotient was 
sometimes used in the one sense and sometimes in the other. To complicate
the situation still more, it even occasionally happened that
those who used achievement or accomplishment quotient in the sense
suggested by Monroe, and by Franzen in 1920, employed accomplishment
ratio for accomplishment age divided by chronological age. In
other words, they used A.Q. and A.R. in just the reverse senses from
those which Franzen advocated in 1922. In the meantime the educational
and subject quotients continued to receive more use, with
no difference of opinion or practice as to their meaning,
both being used to refer to achievement as related to actual age; that
is, to $\dfrac{\text{E.A.}}{\text{C.A.}}$ and $\dfrac{\text{S.A.}}{\text{C.A.}}$, respectively.

6All that is necessary to make this reduction is to cancel the common denominator C.A.

7Franzen, R. H. “The Accomplishment Ratio,” Teachers College, Columbia University
Contributions to Education, No. 125. New York: Bureau of Publications, Columbia University, 1922.
Franzen, R. H. “The Conservation of Talent,” Terman, L. M., et al. Intelligence Tests
and School Reorganization. Yonkers: World Book Company, 1922, Chapter IV.

8Although at this time (1922) Franzen dropped the term “accomplishment quotient,” he
later employed it as synonymous with educational quotient or subject quotient.


Pintner’s difference method. Before the quotient or ratio technique 
had become thoroughly established, another method of comparing 
achievement with capacity was suggested. In connection with his 
general survey and mental tests, Pintner proposed the use of a mental 
index, an educational index, and finally the difference between the 
two.9 This appears to have been the first proposal for the use of a
difference between achievement and intelligence test scores rather than 
the use of a quotient or ratio between them as a measure of achieve- 
ment relative to capacity. Probably because the concept of mental 
age and the accompanying intelligence quotient were widely under- 
stood and firmly established in educational usage, and because of the 
similarity thereto of age and quotient or ratio measures of school 
achievement as contrasted with Pintner’s indices and difference, the 
latter were never received into the same popular favor as the former. 
Few persons except Pintner and his students and co-workers made 
use of them and they were very rarely if ever employed in connection 
with other tests than his. 

The efficiency quotient. Another suggested method of comparing
achievement with capacity was that of Torgerson, who called his
measure the “efficiency quotient.”10 As preliminary to securing it,
he defined the achievement quotient in a new way, roughly synonymous
with the meaning of the subject or educational quotient. Instead
of the common method of transmuting a pupil’s achievement score
into a subject age and then dividing by the chronological age, he suggested
that the achievement score be divided by the norm or average
score for the pupil’s age, using both in terms of point scores.11 The
result, which he called the achievement quotient, is divided by the
intelligence quotient to yield the efficiency quotient. Thus, in a general
way, the latter has a similar significance to Monroe’s achievement
quotient and Franzen’s 1920 accomplishment quotient, in that it represents
a comparison of achievement with capacity. Torgerson justified
his method by stating that a grade norm represented the performance
of the average pupil, or in other words, was based upon an I.Q. of
100.

9Pintner, Rudolf and Marshall, Helen. “A Combined Mental-Educational Survey,”
Journal of Educational Psychology, 12:32-48, January, 1921.
Pintner, Rudolf. “Manual of Directions for the Non-Language Mental and Educational
Tests.” Columbus, Ohio: College Book Store, 1920. 16 p.

10Torgerson, T. L. “The Efficiency Quotient as a Measure of Achievement,” Journal
of Educational Research, 6:25-32, June, 1922.

11A point score is the score yielded directly by a test in terms of exercises done
correctly, level of difficulty reached, or otherwise.
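Torgerson's two-step computation may be illustrated as below. This is a sketch under assumptions: the intelligence quotient is passed in its conventional written form (100 for average), the result is scaled so that 100 likewise means achievement exactly in keeping with capacity, and the function and parameter names are hypothetical rather than Torgerson's own.

```python
def torgerson_achievement_quotient(point_score, age_norm):
    """Achievement point score divided by the norm (average point score)
    for pupils of the same age."""
    return point_score / age_norm


def efficiency_quotient(point_score, age_norm, iq):
    """Torgerson's achievement quotient further divided by the I.Q.

    iq is assumed to be written in the usual way, e.g. 110 for a ratio
    of 1.10; the result is scaled to the same convention.
    """
    achievement_quotient = torgerson_achievement_quotient(point_score, age_norm)
    return 100 * achievement_quotient / (iq / 100)
```

Under these assumptions a pupil with an I.Q. of 110 whose point score is 10 per cent above the norm for his age has an efficiency quotient of approximately 100, while a pupil of the same I.Q. who merely equals the norm falls to about 91.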


High-school and college accomplishment quotients. a. Peters’
proposal. From time to time someone has pointed out that none of
the commonly employed methods of computing quotients or ratios is
adequate in the case of school subjects for which achievement ages
can not be satisfactorily determined. This is a condition that holds
for practically, if not absolutely, all high-school and college subjects.
Achievement therein depends to a relatively small degree upon age,
and much less upon capacity regardless of time spent upon the sub-
ject than in elementary school. Also, differences in high-school curric-
ula are so great that pupils of decidedly varying ages and mental
abilities may be pursuing the same portion of the same subject. A
suggestion for taking care of the situation has been offered by Peters,¹²
who gave an empirical formula developed from a study of the marks
of a considerable number of students at Ohio Wesleyan University.
At first he tried a method that may be described as follows: The
academic marks of students and also their intelligence scores were
transmuted into T-scores,¹³ that is, into standard deviation units,
and then accomplishment quotients were secured by dividing academic
T-scores by intelligence T-scores. As this method was applied, however,
it was found to have a serious defect in that persons ranking highest
or lowest in intelligence could not by any possible means secure ac-
complishment quotients higher or lower, respectively, than 1.00,¹⁴
since the highest academic T-score just equalled the highest intelli-
gence T-score and similarly for the lowest of each. The same in-
justice was involved in the case of all students to the extent to which
they stood near either end of the distribution of intelligence scores.
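
The ceiling that Peters noticed can be seen with two divisions. The sketch below is only an illustration: the function name and the scores are assumptions, and scores are taken in standard deviation units counted from a zero point five units below the mean, so that 5 is average and the highest score on either scale is supposed, for the example, to be 7.5.

```python
def simple_accomplishment_quotient(academic: float, intelligence: float) -> float:
    """Peters' first method: academic score divided by intelligence score."""
    return academic / intelligence


HIGHEST = 7.5  # assumed top score on both the academic and intelligence scales
AVERAGE = 5.0  # mean of a scale whose zero point lies five units below the mean

# The most intelligent student can at best match his own intelligence score.
best_for_top_student = simple_accomplishment_quotient(HIGHEST, HIGHEST)

# A student of average intelligence may earn the same top academic score.
best_for_average_student = simple_accomplishment_quotient(HIGHEST, AVERAGE)

print(best_for_top_student, best_for_average_student)
```

Under these assumed figures the student of highest intelligence cannot exceed 1.00, whereas the average student may reach 1.50, which is the injustice the correction was meant to remove.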

¹²Peters, C. C. “A Method for Computing Accomplishment Quotients on the High-
School and College Levels,” Journal of Educational Research, September, 1926.
¹³For a fuller explanation and discussion of T-scores than is given on p. 8, see:
McCall, W. A. How to Measure in Education. New York: The Macmillan Company,
1922, p. 272 f., or:
Monroe, W. S. An Introduction to the Theory of Educational Measurements. Boston:
Houghton Mifflin Company, 1923, p. 150 f.
¹⁴Peters does not follow the usual practice of omitting the decimal point in accomplish-
ment quotients, but retains it. Therefore, his quotient of 1.00 is equivalent to a quotient of
100 in ordinary usage, of .85 to 85, and so on.

To correct this injustice, the empirical formula already referred
to was developed. It was based upon two considerations: first, that
the correction should be an addition for students above average and
a deduction for those below average sufficient to offset the amount
which each is above or below the mean; second, that since students of
high intelligence who exceed average accomplishment may be ex-
pected to do as much better than the average, and likewise those of
inferior intelligence who fall below it to do as much worse, as students
of normal or average intelligence, the addition or subtraction need only
be made to the extent to which the normal chances have been ex-
hausted. The average deviation from the median for the middle
quintile¹⁵ was found to be .13σ;¹⁶ therefore, those of highest intelli-
gence should have a chance to earn .13σ on the average above normal,
and those of lowest intelligence should on the average fall that much
below, in their achievement quotients. The midpoint of the distribu-
tion should be 5σ, since the zero point of a T-scale or other scale based
on standard deviation units is commonly taken at −5σ and the upper
end of the scale at +5σ, so that its total range is 10σ and, of course,
the midpoint 5σ above the zero point. In view of these considerations,
the formula which Peters suggested to satisfy the conditions was as
follows, using somewhat different terminology from that which he
employed:
ALO. = a ae Sa , 
I (5+4—1) 64,4—A) 


In this formula, A equals the academic mark expressed in terms of 
T-scores or standard deviation units, and I the intelligence test score 
similarly expressed. Wherever the plus or minus sign occurs in the
formula, plus is to be used if the student’s intelligence score is greater
than 5σ and minus if it is less than 5σ. Peters suggested further that
the formula not be employed unless both intelligence and academic
accomplishment are above 6σ or below 4σ, since between these limits
the correction is not more than .01 and therefore not worth the trouble
of computing and applying.

¹⁵Quintile is synonymous with fifth. Thus the middle quintile of any group includes
the fifth of the individuals above and below which there are two-fifths when all are arranged
in order according to the trait or measurement in question.
¹⁶The small Greek letter sigma, σ, is the commonly used abbreviation for the standard
deviation.

b. Otis’ suggestion. In his article, Peters also stated certain ob-
jections to his formula advanced by Otis and likewise a substitute
proposed by the latter. This substitute is that the academic mark
expressed in terms of the standard deviation be divided by the intelli-
gence score in the same terms after the latter has been multiplied by
the coefficient of correlation between the two, adjusted until the cor-
relation between I.Q.’s and A.Q.’s is made zero. Otis’ reasoning is
stated as follows:


“The line of the means of the academic point-averages regresses from the
line of relation so that at any point the latter falls only $r_{AI}$ times as far above
or below the mean as the former. For any given position in the intelligence
series we can find the mean academic achievement that corresponds to it by
multiplying the intelligence quotient by $r_{AI}$. Since this is the mean A, academic
attainment, made by those with this I, we may take it as the ‘normal’ or ‘stand-
ard’ A for this I. The accomplishment quotient would then be the attained A
divided by the derived one, that is,

\[ \text{A.Q.} = \frac{A}{r_{AI}\,I} \]

where each is measured from the mean. Starting from −5σ as a zero point,

\[ \text{A.Q.} = \frac{5 + A}{5 + r_{AI}\,I} \]
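
Otis' substitute, as described in the passage quoted above, lends itself to a short sketch. The function name and the sample figures are assumptions made for illustration; A and I are taken as deviations from their means in standard deviation units, and r is the coefficient of correlation between them.

```python
def otis_accomplishment_quotient(a_dev: float, i_dev: float, r: float) -> float:
    """Attained academic mark divided by the mark predicted from intelligence.

    Both scores are sigma deviations from the mean. Five is added to each
    term so that the zero point falls five sigma below the mean.
    """
    predicted_a = r * i_dev  # mean academic mark of students with this intelligence
    return (5.0 + a_dev) / (5.0 + predicted_a)


# Assumed example: intelligence two sigma above the mean, correlation of .50.
exactly_as_predicted = otis_accomplishment_quotient(1.0, 2.0, 0.5)
better_than_predicted = otis_accomplishment_quotient(2.0, 2.0, 0.5)
print(exactly_as_predicted, round(better_than_predicted, 2))
```

A student whose mark equals the mean mark of those of like intelligence thus receives 1.00, whatever his intelligence may be.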


Symonds’ index of studiousness or of effort. Another proposal for
comparing the achievements with the capacities of secondary-school
pupils has been made by Symonds.¹⁸ He accepted the achievement
ratio as satisfactorily filling the need for such a measure in the ele-
mentary school, but showed, as have others also, that decidedly serious
difficulties were connected with its use in high school. Therefore he
proposed the use of a measure which he called the “index of studious-
ness” or “index of effort.” As may be seen by looking at its definition
in Chapter I,¹⁹ Symonds used these terms in a general way and not as
being limited to a single method of computing such a measure. In
fact, he gave two possible methods of doing so, and implied that others
as well might be used. Though differing in detail, both of his methods
are based upon differences rather than quotients. He preferred the
second, the one based upon standard deviation scores, to the first, even
though it is more difficult. In his discussion, however, he admitted
that both possess certain common weaknesses. Among these are
that they show a regression effect and that they do not permit com-
parison between members of different classes, but only between those
of a single class.


¹⁷Peters, C. C. “A Method for Computing Accomplishment Quotients on the High-
School and College Levels,” Journal of Educational Research, September, 1926.
¹⁸Symonds, P. M. Measurement in Secondary Education. New York: The Macmillan
Company, 1927, p. 521-25.
¹⁹See p. 10.
²⁰Nygaard. “A Revised Accomplishment Quotient,” Journal of Educational Re-
search, …

Nygaard’s method of computing accomplishment quotients.
Another proposal of a different method of computing the accom-
plishment quotient has been made by Nygaard.²⁰ His suggestion, how-
ever, was not intended to take care of the situation in high schools
or other places where age norms are not available, but rather to
remedy what he considered a defect in the usual method of comput-
ing A.Q.’s. It has been shown by various studies that it is almost
impossible for a child with a high I.Q. to earn a high A.Q. and like-
wise that a child with a low I.Q. rarely has a low A.Q. Nygaard
proposed to correct this condition. He defined accomplishment quo-
tient as the educational or achievement age divided by the predicted
educational or achievement age instead of by the mental age. In other
words,

\[ \text{A.Q.} = \frac{\text{A.A.}}{\text{Predicted A.A.}} \]

The predicted A.A. is to be determined from mental age by the
ordinary regression equation²¹ as follows:

\[ \text{Predicted A.A.} = r\,\frac{\sigma_{A.A.}}{\sigma_{M.A.}}\,(\text{M.A.} - M_{M.A.}) + M_{A.A.} \]

That is, the predicted achievement age is found by adding to the
mean achievement age ($M_{A.A.}$) the product of the coefficient of cor-
relation²² (r) of achievement age with mental age times the standard
deviation of the first ($\sigma_{A.A.}$) over that of the second ($\sigma_{M.A.}$) times
the difference between mental age (M.A.) and mean mental age
($M_{M.A.}$).
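
Nygaard's procedure, as described above, may be sketched as follows. This is a minimal illustration and not Nygaard's own computation: the function names are invented, population standard deviations are assumed, and the mental and achievement ages (in months) are ten illustrative pairs of the kind met with in a single class.

```python
from statistics import mean, pstdev


def pearson_r(xs, ys):
    """Product-moment coefficient of correlation between paired measures."""
    mx, my = mean(xs), mean(ys)
    covariance = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return covariance / (pstdev(xs) * pstdev(ys))


def predicted_aa(ma, mental_ages, achievement_ages):
    """Achievement age predicted from mental age by the regression equation."""
    r = pearson_r(mental_ages, achievement_ages)
    slope = r * pstdev(achievement_ages) / pstdev(mental_ages)
    return slope * (ma - mean(mental_ages)) + mean(achievement_ages)


def nygaard_aq(aa, ma, mental_ages, achievement_ages):
    """Achievement age divided by predicted achievement age, on the 100 scale."""
    return 100.0 * aa / predicted_aa(ma, mental_ages, achievement_ages)


# Illustrative ages in months (assumed data for ten pupils).
mental_ages = [178, 135, 152, 160, 145, 142, 112, 101, 86, 66]
achievement_ages = [186, 139, 149, 154, 129, 117, 122, 108, 102, 85]

quotients = [round(nygaard_aq(aa, ma, mental_ages, achievement_ages))
             for aa, ma in zip(achievement_ages, mental_ages)]
print(quotients)
```

Because the regression line passes through the two means, a pupil of mean mental age is expected to have exactly the mean achievement age of his group, and the deviations of actual from predicted achievement ages sum to zero.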

Nygaard stated that by his method the average A.Q. for any group 
will be 100 irrespective of how the group ranks in achievement, the 
negative correlation between A.Q. and I.Q. will be eliminated, allow- 
ance will be made for the common fact that a group’s achievement 
age tends to have a smaller range of variability than does its mental 
age, and any handicaps common to a group because of inferior instruc- 
tion or other causes will not operate to cause lower A.Q.’s. 

Rand’s suggestion. Following rather severe criticisms of the sta- 
tistical validity of measures of achievement relative to capacity, Miss 
Rand²³ has suggested what she calls a program of reconstruction of
such measures. Since, however, her suggestions have to do entirely with 
the matter of statistical validity, they will not be considered at this 
point, but rather in Chapter IV, which deals with that topic. 

²¹… a linear or first-degree equation showing the relationship
of paired measures so that if one of a pair is known, the best possible …
²²The coefficient of correlation, abbreviated r, is a measure of the rectilinear or straight-
line … It ranges in value from … .00, which indicates that there … de-
noting perfect negative or inverse relationship.
²³Rand, Gertrude. “A Discussion of the Quotient Method of Specifying Test Results,” …

Summary. A year or two previous to 1920, McCall and his stu-
dents began to employ the educational and subject quotients. The
achievement quotient was first mentioned in print by Monroe in 1920,
although Franzen at least a year earlier had conceived the same idea
and employed orally the synonymous term “accomplishment quotient.”
In 1922, Franzen suggested “accomplishment ratio” instead of “ac-
complishment quotient,” employing the latter for another purpose.
About the same time Pintner suggested his difference method, but this
did not meet with a great degree of acceptance. The same is true of
the “efficiency quotient” suggested by Torgerson. Peters and Otis
have proposed methods for computing high-school and college accom-
plishment quotients, but neither appears to have received any use.
In 1927, Symonds suggested an “index of effort,” which may be com-
puted in any one of several ways. Still more recently, Nygaard has
suggested a change in the method of computing accomplishment quo-
tients. Present practice may be summarized by saying that the achieve-
ment quotient suggested by Monroe, and its synonym, the original ac-
complishment quotient of Franzen, are widely used, the subject quo-
tient, educational quotient, and the accomplishment ratio somewhat
less so, and the others rarely or not at all.


CHAPTER III 


MERITS AND DEMERITS OF THE VARIOUS PROPOSED 
MEASURES 


Problem of this chapter. As was stated on page 11, the problem 
of this chapter is to present the merits and demerits of the various 
proposed measures of achievement relative to capacity. This general 
problem, however, may be broken up into a number of parts, which 
deal with the following questions: 

1. What is the most desirable terminology in cases where two 
or more terms with identical meanings have been proposed? 

2. What are the advantages and disadvantages of comparing 
achievement with chronological age as contrasted with comparing 
it with mental capacity? 

3. What significance should be attached to an achievement 
quotient or similar measure of 100? 

4. What are the advantages and disadvantages of quotient 
measures as contrasted with difference measures? 

5. What are the merits and demerits of each of the following 
suggested measures? 

a. Torgerson’s efficiency quotient 

b. Peters’ accomplishment quotient 

c. Otis’ accomplishment quotient 

d. Symonds’ index of studiousness or of effort 

e. Nygaard’s accomplishment quotient

Before proceeding to the consideration of these questions, one im-
portant limitation of the discussion in this chapter should be noted.
Since Chapter IV is devoted to the question of statistical validity and 
Chapter V to that of reliability, these topics will not be included in 
the general discussion of this chapter, although some reference will 
be made to reliability in answering the second and fourth questions 
stated above. On the whole, however, the merits and demerits of 
various proposed measures will be dealt with without regard to their 
validity and reliability. 

Terminology. As was shown in Chapter I, there has been an un-
necessary and confusing multiplication of terms and also consider-
able use of the same terms with different meanings. Because the term
“quotient” has become much more widely used than “ratio” in com-
paring achievement with capacity, the writer recommends that it alone
be employed and the latter dropped entirely. Furthermore, the ex-
pression “attainment quotient” has been so very rarely used that it
also may be eliminated for the sake of reducing unnecessary dupli- 
cation. This leaves “achievement quotient” and “accomplishment 
quotient” to be applied to the achievement or accomplishment age di- 
vided by the mental age, and the writer urges that these terms, and 
these only, be used with this meaning. Theoretically, it might be 
still more desirable to employ just one of them and drop the other, 
but as both have come into such general use, and as they are very 
similar, it seems well to retain and approve both. With regard to the 
variations in the usual achievement quotient proposed by Peters, Otis, 
Nygaard and perhaps others, the writer recommends that these 
measures be referred to as Peters’ accomplishment quotient, Otis’ ac- 
complishment quotient, Nygaard’s accomplishment quotient, and so 
forth. For the result obtained by dividing by chronological instead 
of mental age, the expressions “subject quotient” and “educational 
quotient” may well be retained, the former to refer to a quotient based 
upon performance in a single school subject and the latter to one 
based upon average performance in several. 

The question of differences in terminology has not arisen in con- 
nection with difference measures except that Symonds has suggested 
that either “index of studiousness” or “index of effort” be used for 
the measures which he suggests. It does not seem to the writer that 
there is any weighty reason why either one of these terms should be 
preferred to the other, but it does seem desirable that if this measure 
be employed, a single and universally used name be given it. He ven- 
tures to suggest, therefore, that the second term, “index of effort,” 
be the preferred one. This recommendation is made for at least two 
reasons. The word “effort” is probably somewhat more commonly 
used and understood than “studiousness” and also is shorter. 


¹Sherrod, C. C. “The Development of the Idea of Quotients in Education,” Peabody
Journal of Education, 1:44-49, July, 1923.

Mental age versus chronological age as the standard of com-
parison. The comparison of achievement with capacity to achieve as
expressed in terms of general intelligence or mental age is in most
cases much more significant than its comparison with chronological
age or the mere number of years an individual has lived. In other
words, as Sherrod¹ and others have pointed out, the achievement
quotient is ordinarily a decidedly more significant measure of relative
achievement than is the subject or educational quotient. Within the
last few years, a considerable mass of evidence has been accumulated
which indicates that what is called general intelligence is a consider-
ably more potent factor in determining pupil achievement than is chron-
ological age or even the number of years spent in school. That is,
an individual’s mental ability contributes more largely to his perform-
ance in the school subjects than does the length of time for which
he has been exposed to the chance to learn.² Therefore, for practi-
cally all purposes for which relative measures of achievement are em-
ployed, it is more helpful to compare achievement with intelligence.
It is ordinarily perfectly legitimate to expect a pupil to do school work
corresponding to his intelligence, but not to his chronological age
unless he happens to possess average intelligence for that age. More-
over, for these reasons, it is fair to rate teachers according to how well
they capitalize their pupils’ mental abilities, but not according to how
great achievements they secure from pupils of a given chronological
age.

Ruch,³ among others, has suggested that since the correlation
between achievement age and mental age in a single grade is prac-
tically always positive, whereas that between the achievement age and
chronological age in a single grade is negative, the former compari-
son is better. This reason alone does not seem sufficiently strong to
justify the use of $\frac{\text{A.A.}}{\text{M.A.}}$ rather than $\frac{\text{A.A.}}{\text{C.A.}}$, but is evidence for the fact
stated above, that achievement depends upon mental ability rather
than mere age.

²The statement made above is true in general, although there may be exceptions to it
in individual cases. No one doubts, for example, that a child of relatively low intelligence
who has been in school for several years, will do better in the school subjects than one of
high intelligence who has never, either at school, at home, or elsewhere, had the opportunity
of reading, spelling, working with numbers, and so forth. Such extreme cases, however, are
almost non-existent, as are also those even very nearly approaching them.
Evidence to support the statement made in the text may be found in the following
references:
Heilman, J. D. “The Relative Influence upon Educational Achievement of Some Heredi-
tary and Environmental Factors,” Twenty-Seventh Yearbook of the National Society for the
Study of Education, Part II. Bloomington, Illinois: Public School Publishing Company,
1928, p. 35-65.
Denworth, K. M. “The Effect of Length of School Attendance upon Mental and Educa-
tional Ages,” Twenty-Seventh Yearbook of the National Society for the Study of Education,
Part II. Bloomington, Illinois: Public School Publishing Company, 1928, p. 67-91.
³Ruch, G. M. “The Achievement Quotient Technique,” Journal of Educational Psy-
chology, 14:334-43, September, 1923.
⁴See Appendix A for a brief discussion of how such negative correlation may be re-
duced.
⁵Beeson, M. F. and Tope, R. E. “The Educational and Accomplishment Quotients as
an Aid in the Classification of Pupils,” Journal of Educational Research, 9:281-92, April, 1924.

An argument sometimes advanced in favor of the use of the educa-
tional quotient and subject quotient instead of the achievement quo-
tient is that the former correlate rather highly with the intelligence
quotient, whereas the latter practically always exhibits a negative cor-
relation therewith.⁴ Beeson and Tope,⁵ for example, cite data which
show a correlation of .90 between the E.Q. and the I.Q., whereas that
between the A.Q. and the I.Q. for the same cases is −.46. Many
other writers, among whom may be named Murdoch,⁶ MacPhail,⁷
Toops and Symonds,⁸ Popenoe,⁹ Wilson,¹⁰ and Franzen¹¹ likewise re-
port data showing negative correlations of considerable size between
achievement quotients and intelligence quotients. It is not apparent
to the writer, however, why this fact should be an argument against
employing the achievement quotient. The usefulness of the A.Q.
does not appear to depend at all on whether it correlates positively
or negatively with the I.Q. Indeed, the very fact that a negative cor-
relation is practically always found has called to our attention a sig-
nificant condition which needs remedial attention. It is true that some
educators and others had realized that most instruction in our schools
was better adapted to average and dull pupils than to bright ones,
but the finding of many negative correlations of the sort just referred
to has brought the fact home in such a pointed fashion as to arouse
a much keener and more general realization of the need.

Another argument occasionally urged against the use of mental 
age in preference to chronological age as the divisor is that the result- 
ing quotient is more unreliable. This is, of course, true. The A.Q. is 
based upon two unreliable quantities, A.A. and M.A., whereas the 
S.Q. or E.Q. is based upon only one, since chronological age can ordi- 
narily be determined with a high degree of accuracy. This argu- 
ment does not seem, however, to possess any considerable validity. 
The mere fact that one measure is more reliable than another does 
not justify its use when it possesses little significance. If the same 
argument were carried further, it would do away with the use of 
achievement age in the numerator, since it also is unreliable, and 
would require that some measure which can be determined with per- 
fect or almost perfect reliability be substituted. Since we possess 
no, or practically no, such measure of school achievement, measure- 
ment activity along that line would have to cease. 


⁶Murdoch, Katharine. “The Accomplishment Quotient: Finding It and Using It,” Teach-
ers College Record, 23:229-39, May, 1922.
⁷MacPhail, A. H. “The Correlation Between the I.Q. and the A.Q.,” School and
Society, 16:586-88, November 18, 1922.
⁸Toops, H. A. and Symonds, P. M. “What Shall We Expect of the A.Q.?” Journal of
Educational Psychology, 13:513-28, December, 1922; 14:27-38, January, 1923.
⁹Popenoe, Herbert. “A Report of Certain Significant Deficiencies of the Accomplishment
Quotient,” Journal of Educational Research, 16:40-47, June, 1927.
¹⁰Wilson, W. R. “The Misleading Accomplishment Quotient,” Journal of Educational
Research, January, 1928.
¹¹Franzen, Raymond. “The Accomplishment Ratio,” Teachers College, Columbia Uni-
versity Contributions to Education, No. 125. New York: Bureau of Publications, Teachers
College, Columbia University, 1922. 59 p.
¹²Ibid.

Significance of an A.Q. of 100. In his discussion of the accom-
plishment ratio, Franzen¹² appeared to regard an accomplishment
ratio, or, as more commonly expressed, an achievement quotient, of
100 as the maximum if errors were eliminated. He argued that if
the E.Q., which equals $\frac{\text{A.A.}}{\text{C.A.}}$, is ever greater than the I.Q., which equals
$\frac{\text{M.A.}}{\text{C.A.}}$, it is probably because of spurious differences¹³ and that, there-
fore, A.Q.’s secured by the formula $\text{A.Q.} = \frac{\text{E.Q.}}{\text{I.Q.}}$ will not exceed 100
except for the presence of accidental errors. As a number of writers,
among whom are Toops and Symonds,¹⁴ Symonds,¹⁵ and Foran,¹⁶
have pointed out, this concept is entirely untenable. There is no
theoretical limit above which the A.Q. may not rise, although in actual
practice it is rarely greater than 200, and in only a small per cent of
cases exceeds 150.¹⁷ Since an age norm, whether mental or achieve-
ment, of a given amount is the average performance of an unselected
group of pupils of that age, it necessarily follows that the average
A.Q. of an unselected group must be 100. For Franzen’s concept to
be valid it would be necessary to assume that no pupil’s achievement
could rise above the average of pupils of his mental age. If none
could rise above the average it would follow that none could fall
below it, and therefore that all would be just the same. However, we
know that because some pupils study harder and by better methods
than others, are more interested in their subjects, receive more out-
side help, and so on, they do better work than average pupils of their
intelligence, and that the reverse of these causes is the reason why
others do worse work than the average. Franzen’s idea of some
measure which would show how well pupils are achieving in com-
parison with the best they can do rather than with what average
pupils actually do achieve, possesses some value, but this comparison
cannot be made by means of any of the achievement quotients. Some
of the proposed difference methods can be adapted to serve this pur-
pose.
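
The relation among the three quotients discussed above can be checked by a short computation. The sketch is only illustrative: the function name and the ages (in months) are assumptions. It shows that dividing the E.Q. by the I.Q. cancels the chronological age and leaves the achievement age divided by the mental age, so that nothing in the arithmetic holds the A.Q. at or below 100.

```python
def quotient(numerator_age: float, denominator_age: float) -> float:
    """Ratio of two ages expressed on the customary scale where 100 is par."""
    return 100.0 * numerator_age / denominator_age


def aq_from_eq_and_iq(aa: float, ma: float, ca: float) -> float:
    """A.Q. obtained by dividing the E.Q. (A.A./C.A.) by the I.Q. (M.A./C.A.)."""
    eq = quotient(aa, ca)
    iq = quotient(ma, ca)
    return 100.0 * eq / iq


# Assumed pupil: achievement age 122, mental age 112, chronological age 128.
by_division = aq_from_eq_and_iq(122, 112, 128)
directly = quotient(122, 112)
print(round(by_division), round(directly))
```

Any pupil whose achievement age exceeds his mental age, as in this assumed case, obtains an A.Q. above 100 by either route.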
¹³The differences are spurious because they are due to the unreliability of the test or …
¹⁴Toops, H. A. and Symonds, P. M. “What Shall We Expect of the A.Q.?” Journal
of Educational Psychology, 13:513-28, December, 1922; 14:27-38, January, 1923.
¹⁵Symonds, P. M. Measurement in Secondary Education. New York: The Macmillan
Company, 1927, p. …30.
¹⁶Foran, T. G. “The Meaning and Limitations of Scores, Norms, and Standards in
Educational Measurement,” Catholic University of America Educational Research Bulletin,
Vol. 3, No. 2. Washington: Catholic Education Press, February, 1928, p. 16-19, 23-26.
¹⁷Of some 30,000 achievement quotients, based on a number of different achievement
and intelligence tests, compiled by the writer, only about one-half of one per cent exceeded
200, and 5 per cent were above 150.

An argument sometimes advanced against the use of the achieve-
ment quotient or any similar measure with a mean value of 100 is
that many pupils, parents, and others falsely assume that an A.Q.
of 100 is satisfactory. Because of long familiarity with the percentile
marking system, in which, of course, a mark of 100 is perfect, they
tend to transfer this meaning to other measures. There is no doubt
that some such transfer does take place. This fact, however, does not
seem to be at all a valid argument for dropping or changing the ordi-
nary achievement quotient. The answer to it is rather that all those
concerned should be educated to understand that an A.Q. of 100 repre-
sents only average or mediocre performance, and that it is not very
difficult to bring about this understanding if well-planned effort to do
so is put forth.

Quotient measures versus difference measures. The general judg- 
ment of those connected with the educational measurement movement 
as to the relative merit of differences and quotients is shown by the 
fact that Pintner’s difference method,¹⁸ the best-known and most
strongly advocated, is rarely employed, whereas quotients are receiving
widespread use. The chief reason appears to be the comparative 
ease of understanding achievement and mental ages and the quotients 
computed therefrom as contrasted with educational and mental indices 
and their differences. It requires considerable familiarity with test 
scores for an uninitiated person to form a concrete idea of what an 
educational index of 68, for example, or a mental index of 53, means, 
but it is relatively easy to understand the meaning of a mental age of 
twelve years and six months or of an achievement age of ten years 
and eight months. On the other hand, there appear to be no con- 
vincing reasons why differences are preferable to quotients. 

¹⁸See p. 14 f.

In order to see how Pintner’s differences compare with achieve-
ment quotients, Table I is given. It presents the chronological ages
and educational and mental point scores of ten pupils and the ages,
indices, differences, and quotients computed therefrom. The point
scores given in the third and fourth columns have been transmuted
into the achievement ages and mental ages given in the next two col-
umns, and likewise into the educational and mental indices in columns
seven and eight. Finally, the last two columns give the differences
by Pintner’s method and the achievement quotients. It will be seen
that on the whole there is a fair amount of agreement between the
results obtained by the two methods. Pupils A, B, C, and D, in whose
cases the differences are only one, either plus or minus, have A.Q.’s in
no case more than four points above or below 100. Similarly those for
whom the differences are greater have A.Q.’s that differ from 100
by larger amounts. The relationship is by no means perfect, however.
Thus pupils A and C both have differences of −1, whereas one has
an A.Q. of 104, the other of 98. Also the achievement quotient of
pupil F, whose difference is −15, falls only 18 points below average,
whereas pupil J, with a difference of only +11, has a quotient 29
points above average. The coefficient of correlation between the two
is .94, which means that predictions of A.Q.’s from differences, or
vice versa, would be about one-third pure guesses.¹⁹ In other words,
the conclusions drawn would sometimes differ markedly according to
which measure was used.

TABLE I. COMPARISON OF DIFFERENCES FOUND BY PINTNER’S
METHOD WITH ACHIEVEMENT QUOTIENTS

        C.A.     Educ.   Ment.   A.A.     M.A.
Pupil   in       Point   Point   in       in       Educ.   Ment.   Differ-
        Months   Score   Score   Months   Months   Index   Index   ence      A.Q.

A       184      70      392     186      178      47      48      −1        104
B       152      51      308     139      135      43      42      +1        103
C       144      59      361     149      152      52      53      −1         98
D       138      64      377     154      160      60      59      +1         96
E       136      42      346     129      145      46      55      −9         89
F       132      31      338     117      142      41      56      −15        82
G       128      35      225     122      112      46      43      +3        109
H       124      24      182     108      101      41      39      +2        107
I       114      18      113     102       86      42      35      +7        119
J       108       4       18      85       66      27      16      +11       129
Mean    136      40      266     129      128      45      45       0        104
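
The coefficient of correlation and the "one-third pure guesses" mentioned above can be verified from the last two columns of Table I. The sketch below is an illustration only; the function name is an assumption, and the coefficient of alienation is computed as the square root of one minus the square of r.

```python
from math import sqrt
from statistics import mean

# Last two columns of Table I for pupils A to J.
differences = [-1, 1, -1, 1, -9, -15, 3, 2, 7, 11]
quotients = [104, 103, 98, 96, 89, 82, 109, 107, 119, 129]


def pearson_r(xs, ys):
    """Product-moment coefficient of correlation between paired measures."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r = pearson_r(differences, quotients)
alienation = sqrt(1.0 - r ** 2)  # share of a prediction that is pure guess
print(round(r, 2), round(alienation, 2))
```

For these ten pupils the computed r comes out close to the .94 reported, and the coefficient of alienation close to one-third.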

In the foregoing discussion of the relative merits of difference 
and quotient measures no attention has been paid to reliability. There 
appears to be practically no difference in the reliability of measures of 
the two kinds computed from the same original data; therefore, 
neither is to be preferred to the other on this basis. Evidence to 
support this statement may be found in Appendix C. 

Torgerson’s efficiency quotient. The efficiency quotient suggested 
by Torgerson®® has been rarely, if ever, employed by others, and does 
not seem to deserve a better fate. The only apparent argument in its 
favor as compared with the achievement quotient is that one can 
avoid computing achievement or educational ages by using Torger- 
son’s method. Since for many tests, all that is required to secure 
age norms is merely to read them off from a transmutation table, the 
amount of labor saved in such cases is negligible. Moreover, it is 

The coefficient of alienation, which equals V1 — 2, is a measure of departure from 
perfect correlation. lt gu dicates to what extent the OO lee toe gnTe 
ere re rors eau ee lay eoala be if the estimated -scores were 
pure guesses; that is, if no correlation at all existed. For a more complete explanation, see: 


Wiis, Cats Interpretation of the Probable Error and the Coefficient of Cor- 


relation ey of Illinois Bulletin, Vol. 23, No. 52, Bureau of Educational Research 
Bulletin No. 32. Urbana: University of Illinois, 1926, p. 41-45. 


20See p. 15f. 


28 Butietin No. 45 


TABLE II. COMPARISON OF TORGERSON’S EFFICIENCY QUOTIENTS
WITH ACHIEVEMENT QUOTIENTS

Pupil    C.A. in   M.A. in   I.Q.   Point   A.A. in   Torgerson’s   Torgerson’s   A.Q.
         Months    Months           Score   Months    A.Q.          Eff.Q.
A        195       190        97     70     183       108           111            96
B        188       163        87    102     231       157           180           142
C        188       140        74     54     159        85           115           114
D        178       162        91     78     195       120           132           120
E        168       164        98     69     182       106           108           111
F        165       151        92     34     129        52            57            85
G        164       178       109     73     188       112           103           106
H        164       192       117     31     125        48            41            65
I        158       164       104     78     195       120           115           119
J        155       205       132    113     248       174           132           121
Mean     172       171       100     70     184       108           109           108


very often desirable to turn point scores into age scores irrespective of 
whether or not quotients are later to be computed. Probably the chief 
adverse argument is the same as that stated in the case of Pintner’s 
method, that it is not so readily understood as the ordinary age and 
quotient procedure. 

The differences between efficiency quotients computed by Torger- 
son’s method and achievement quotients found by Monroe’s or the 
ordinary method may be shown by the figures for ten eighth-grade 
pupils given in Table II. The first column of figures gives the chron- 
ological ages of the pupils in months. Next are their mental ages 
determined from intelligence test scores, and their intelligence quo- 
tients computed, of course, by dividing the mental ages by the chron- 
ological ages. In the next column are their point scores upon a 
subject-matter test and then the achievement ages equivalent to these 
point scores. Following these are what Torgerson calls achievement 
quotients, found by dividing each pupil’s point score by the norm for 
the grade, which in this case is 65. The next to the last column con- 
tains Torgerson’s efficiency quotients obtained by dividing his achieve- 
ment quotients by the intelligence quotients. Finally come achieve- 
ment quotients computed in the usual manner, that is, by dividing 
achievement ages by mental ages. 
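
The column definitions above can be checked with a short Python sketch. It recomputes the four quotients for three pupils of the table from their ages and point scores. Two details are assumptions made here rather than statements of the original: the quotients are rounded to whole numbers with halves rounded upward, and the efficiency quotient is computed from the already rounded quotients, which is what reproduces the tabled figures. The function and variable names are illustrative.

```python
GRADE_NORM = 65  # norm for the grade in points, as stated in the text

def half_up(x: float) -> int:
    """Round a non-negative number to the nearest integer, halves upward."""
    return int(x + 0.5)

def quotients(ca: int, ma: int, score: int, aa: int):
    """Return (I.Q., Torgerson's A.Q., Torgerson's Eff.Q., ordinary A.Q.)."""
    iq = half_up(100 * ma / ca)                       # M.A. divided by C.A.
    torgerson_aq = half_up(100 * score / GRADE_NORM)  # point score / grade norm
    eff_q = half_up(100 * torgerson_aq / iq)          # Torgerson's A.Q. / I.Q.
    aq = half_up(100 * aa / ma)                       # A.A. divided by M.A.
    return iq, torgerson_aq, eff_q, aq

# (C.A., M.A., point score, A.A.) in months or points, from the table.
pupils = {"A": (195, 190, 70, 183),
          "D": (178, 162, 78, 195),
          "J": (155, 205, 113, 248)}

for name, data in pupils.items():
    print(name, quotients(*data))
```

For pupil J, for instance, the efficiency quotient comes out eleven points above the ordinary A.Q., although both rest on the same test scores.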

It will be seen that for this small group of pupils with a mean 
I.Q. of 100, the two methods yield approximately the same average 
results, the average Eff.Q. being 109 and the average A.Q. 108. A 
comparison of the quotients of the individual pupils, however, reveals 
some rather large differences, even though the coefficient of correlation 
between the two is .97. For pupils C, E, G, and I the differences 
are less than five points, but for others they are much larger, those 
for B, F, and H being 24 or more points. The chance element21 in 


21See footnote on p. 27. 




predicting one from the other is almost one-fourth. Furthermore, it 
appears that on the whole this method has the undesirable effect of 
tending to increase high quotients and decrease low ones. Although 
the extreme A.Q.’s are only 65 and 142, the extreme efficiency quo- 
tients are 41 and 180. Also pupil F, with an A.Q. of 85 has an Eff.Q. 
of only 57, whereas pupil D’s 120 and pupil J’s 121 both become 132. 
It happens that in the case of the achievement test of which the scores 
are used in this illustration, the relationship between point scores and 
achievement ages is rectilinear. In other words, a certain difference 
in point score is always equal to the same difference in achievement 
age regardless of where it occurs, one point equalling one and one-half 
months. In the case of an achievement test concerning which this 
is not true, but for which the line of relationship is curvilinear, the 
differences between ordinary achievement quotients and Torgerson’s 
efficiency quotients would tend to be even greater and more irregular 
than those in the example above. 
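
The rectilinear relationship just described can be written out as a small Python sketch. The slope of one and one-half months per point is stated in the text; the intercept of 78 months is an inference made here from the tabled pairs (for example, 70 points with 183 months and 78 points with 195 months) and is not given by the author. The comparison of a point-score ratio with an age ratio is likewise only an illustration of why quotients built on point scores spread more widely than those built on ages.

```python
GRADE_NORM = 65  # norm for the grade in points, as stated in the text

def achievement_age(point_score: float) -> float:
    """Achievement age in months: slope 1.5 from the text, intercept 78 inferred."""
    return 78 + 1.5 * point_score

def point_ratio(point_score: float) -> float:
    """Point score as a percentage of the grade norm (Torgerson's A.Q.)."""
    return 100 * point_score / GRADE_NORM

def age_ratio(point_score: float) -> float:
    """Illustrative only: achievement age as a percentage of the norm's age."""
    return 100 * achievement_age(point_score) / achievement_age(GRADE_NORM)

for score in (34, 65, 113):
    print(score, achievement_age(score),
          round(point_ratio(score), 1), round(age_ratio(score), 1))
```

Because the age scale does not start at zero where the point scale does, a low point score is a much smaller fraction of the norm than the corresponding age is of the norm age, and a high score a much larger one.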

Peters’ accomplishment quotient. Although it is true, as Peters 
points out, that the ordinary accomplishment quotient is not very satis- 
factory for high-school use because the age norms upon which it must 
be based are unsatisfactory, yet Peters’ proposal22 does not seem to 
meet the need. The chief objection to it is that it is too complex and 
difficult for common use. Research workers and others engaged in 
experimentation might employ it, but the ordinary classroom teacher, 
supervisor, or administrator can hardly be expected to do so. 

A second, though less important, objection is that the formula 
given by Peters is inapplicable to certain cases, and in others yields 
results which are evidently not valid. The reason this objection is 
not more important is that such cases are decidedly unusual. However, 
there would probably be a few of them in every group of several 
hundred or more students, and perhaps one in almost every class. It 
will be recalled that the formula which Peters recommends is 


A.Q. = A/I ± [a correction fraction whose terms are illegible in this copy]


in which A equals the academic mark in standard deviation units, I 
the intelligence test score in such units, the plus sign is to be used 
if I is greater than five, and the minus sign if it is less than five. If 
either A or I in the formula becomes nine or one, the denominator 
in the fraction becomes zero and, therefore, the value of the fraction 
and also of the A.Q. equals infinity. In case both A and I are slightly 


22See p. 16f. 




above or below 1.00, the result is a negative achievement quotient, 
which is patently impossible. For example, if A equals .70, and I 
equals .80, the A.Q. value given by the formula is −1.29.23 If both 
A and I have values slightly above or slightly below 9.00, the formula 
yields a ridiculously large result. For example, if A equals 9.2, and 
I equals 9.1, the resulting A.Q. is 7.51, which is manifestly absurd, 
especially in view of the fact that the difference between A and I is 
only .1. 

Otis’ accomplishment quotient. The method suggested by Otis 
and described by Peters24 is open to the same chief objection 
as is that proposed by Peters himself. In other words, it is 
too difficult for regular classroom use. Moreover, from the statis- 
tical standpoint it appears to be inferior to Peters’ proposal. After 
presenting it, Peters goes on to say that although it has statistical 
plausibility, there are several serious objections. First, accomplishment 
quotients according to it would have no standard meaning, since their 
range depended upon the degree of correlation between intelligence 
and achievement. If the test were perfect, the student at the top in in- 
telligence could not secure an A.Q. of more than 1.00. Second, if the 
mean accomplishment corresponding to a particular degree of intelli- 
gence is taken as normal for that degree, the assumption is made that 
the whole lag is due to lack of effort, but as a matter of fact most 
of it is due to the inferiority of the test. We should divide a measure 
of what one does achieve by a true measure of what he is able to 
achieve, which is not done by Otis’ formula. Third, gratuitous ad- 
ditions to A.Q.’s are made for no satisfactory reason. For example, 
if r= .60, a student whose intelligence and achievement were both 
+1σ would get an A.Q. of 1.07; if they were both +2σ, of 1.11; 
whereas in both cases he should, of course, have an A.Q. of 1.00. 
Fourth, the method is clumsy, because a new formula would have 
to be made for every different test used and every different school 
that employed the method. Furthermore, the empirical adjustment 
to secure zero correlation between A.Q. and I.Q. renders the pro- 
cedure too difficult to expect its common use. Fifth, Peters’ method 
is closely analogous to the usual one for A.Q.’s, whereas that of Otis 
gives a new meaning to the resulting quotient. 

It seems to the writer that on the whole Peters’ objections are 
well founded. Probably the most weighty of these objections is that 
the adjustment to secure the desired zero correlation is decidedly 


23It will be recalled that Peters does not omit the decimal point in writing the accomplishment
quotient, so that −1.29, for instance, is the same as the more usual −129.
24See p. 17f.




difficult, and also that a new formula must be made for each different 
test and school. Very few persons in ordinary public-school work 
would be willing to go to this amount of trouble to compute quotients. 

Symonds’ index of effort. The index of effort or of studiousness 
suggested by Symonds25 is a rather general expression that may be 
applied to any measure which accomplishes the desired result. Of the 
two methods he suggests for computing it, the first is a fairly good 
one for rough work. The index found by it is easily computed and 
understood. On the other hand, the same objection applies to it as 
to all other measures based upon mere ranks rather than upon exact 
scores. This is that if two pupils rank next to each other, it makes 
no distinction according to the size of the difference. For example, 
if the best two pupils in one group have scores of 48 and 47, respec- 
tively, and the best two in another group of 48 and 40, the differences 
in rank will be the same although the difference in scores is eight 
times as great in the second case as in the first. Also, indices secured 
by this method do not mean the same when different numbers of 
pupils are concerned. An index or difference of a given amount is 
much more likely to occur if the pupil for whom it is computed is 
one of a large group than if he belongs to a small group. That is to 
say, the difference between first and second rank in a group of ten 
is ordinarily much greater than in a group of 50 or 100. On the aver- 
age, the difference between the same two ranks in groups of different 
size varies inversely as the size of the groups. Hence, in general, the 
difference between ranks 1 and 2 in a group of ten would be five 
times as great as in a group of fifty. The writer recommends, there- 
fore, that if this index is used, the difference in ranks be divided by 
the number of cases and the result multiplied by 100, to eliminate 
decimals. In formula form, $\text{Index} = \dfrac{100\,(R_1 - R_2)}{N}$, in which $R_1$ equals
rank26 in achievement, $R_2$ rank in intelligence, and N the number of
individuals in the group. 
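
The adjustment recommended here, dividing the difference in ranks by the number of cases and multiplying by 100, can be sketched in a few lines of Python. The function and argument names are illustrative; ranks are taken with the largest number denoting the best pupil.

```python
def rank_index(rank_achievement: int, rank_intelligence: int, n: int) -> float:
    """Difference in ranks as a percentage of the group size."""
    return 100 * (rank_achievement - rank_intelligence) / n

# The same rank difference of +2 in groups of different size.
print(rank_index(10, 8, 10))   # group of ten
print(rank_index(50, 48, 50))  # group of fifty
```

A difference of two ranks thus counts five times as heavily in a group of ten as in a group of fifty, in line with the argument of the paragraph.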

The second method suggested by Symonds is considerably more 
difficult than the first and not readily understood by persons who have 
not had some statistical training. It is, however, not as complicated 
as one or two other methods that have been suggested, and from a 
statistical standpoint appears to be sound. However, the writer doubts 
if the rank and file of teachers or even of supervisors and administra- 


25See the account of Symonds’ proposal in Chapter II.

26In using this formula, a large value of R indicates high rank, and a small value low
rank. For example, in a group of 25 pupils, rank 25 denotes the best, rank 24 the second
best, and so on down to 1, which denotes the worst.




TABLE III. COMPARISON OF INDICES OF EFFORT COMPUTED BY THE
TWO METHODS SUGGESTED BY SYMONDS

Pupil   Ach.    Intel.   Ach.   Intel.   Ach.    Intel.   Ach.    Intel.   Diff.   S.D.    Diff.
        Point   Point    Rank   Rank     P.S.    P.S.     S.D.    S.D.             Index   Index
        Score   Score                    Dev.    Dev.     Dev.    Dev.
A        36     178      10       8       +8     +19      +1.5     +.9     +.6      56      +2
B        33     166       9       6       +5      +7      +1.0     +.3     +.7      57      +3
C        32     181       8       9       +4     +22       +.8    +1.0     −.2      48      −1
D        30     165       7       5       +2      +6       +.4     +.3     +.1      51      +2
E        29     186       6      10       +1     +27       +.2    +1.2    −1.0      40      −4
F        28     170       5       7        0     +11        0      +.5     −.5      45      −2
G        27     154       4       4       −1      −5       −.2     −.2      0       50       0
H        26     138       3       2       −2     −21       −.4     −.9     +.5      55      +1
I        22     142       2       3       −6     −17      −1.2     −.8     −.4      46      −1
J        17     110       1       1      −11     −49      −2.1    −2.2     +.1      51       0
Mean     28     159      5.5     5.5       0       0        0       0       0       50       0


tors can be brought to use a method that requires the computation of 
the standard deviation and the transmutation of scores or differences 
into standard deviation units. 

In order that these two indices may be compared, Table III is 
given. Near the left of this table may be found the achievement 
and intelligence point scores of ten pupils. The next two columns 
contain their ranks, 10 being highest and 1 lowest. The next pair 
contain the deviations from the means, which are, respectively, 28 
and 159. These deviations have been divided by the standard devia- 
tions, 5.2 for achievement and 22.2 for intelligence, and the results 
entered in the next two columns. Under the heading “Diff.” may be 
found the differences between the entries in the two preceding 
columns. These differences are then multiplied by ten and added 
algebraically to fifty, as called for by Symonds’ second method, 
and the results given in the next to the last column. The last 
column contains the differences according to his first method, 
these being found by subtracting the intelligence rank from 
the achievement rank of each pupil. A comparison of the 
last two columns shows a tendency toward agreement, although it 
is not perfect. For example, pupils D and J both have indices of 
51 according to the second method, but the first has +2 and the 
second 0 by the other. Likewise, pupils A and D both have indices 
of +2 by the first, but 56 and 51, respectively, by the second. The 
coefficient of correlation in this case is .95; hence the chance element27 
in predicting either index from the other is about one-third. No at- 
tempt has been made to compare Symonds’ suggested measures with 
achievement quotients, since the latter are commonly not found for 
high-school subjects. Indeed, Symonds does not recommend his in- 


27See footnote on p. 27. 




dex as better than the quotient, but only as usable where the latter 
is not. 
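
The computation of Symonds’ second index described above can be retraced with the following Python sketch. The means and standard deviations are those given in the text. Rounding each standard-deviation score to one decimal place before taking the difference is an assumption made here, chosen because it reproduces the tabled indices; the names are illustrative.

```python
MEAN_ACH, SD_ACH = 28, 5.2     # achievement point scores (from the text)
MEAN_INT, SD_INT = 159, 22.2   # intelligence point scores (from the text)

def sd_index(ach_score: float, intel_score: float) -> int:
    """Second method: 50 plus ten times the difference of S.D. scores."""
    z_ach = round((ach_score - MEAN_ACH) / SD_ACH, 1)
    z_int = round((intel_score - MEAN_INT) / SD_INT, 1)
    return round(50 + 10 * (z_ach - z_int))

def rank_difference(rank_ach: int, rank_int: int) -> int:
    """First method: achievement rank minus intelligence rank."""
    return rank_ach - rank_int

# Pupils A, D, and J: (achievement score, intelligence score, ranks).
for name, ach, intel, r_ach, r_int in [("A", 36, 178, 10, 8),
                                       ("D", 30, 165, 7, 5),
                                       ("J", 17, 110, 1, 1)]:
    print(name, sd_index(ach, intel), rank_difference(r_ach, r_int))
```

Pupils D and J receive the same index by the second method but different ones by the first, which is the disagreement noted in the comparison of the last two columns.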


Nygaard’s accomplishment quotient. In support of his proposal
that $\text{A.Q.} = \dfrac{\text{A.A.}}{\text{Predicted A.A.}}$ instead of $\dfrac{\text{A.A.}}{\text{M.A.}}$, Nygaard advanced
several arguments. One of these is that the average A.Q. of any 
group will be 100 irrespective of how it ranks. This is true, but not, 
therefore, necessarily desirable. A.Q.’s computed on this basis would 
allow valid comparisons to be made within the class or other group 
in question, but not between members of it and those of other groups. 
For some purposes, it may be desirable to determine how well pupils 
are doing with regard to all factors that condition learning, such as 
effects of home training, subject-matter studied, teacher’s ability, and 
habits of study. Such a measure as Nygaard’s A.Q. would determine 
this rather well, but it would not at all show how well a teacher was 
realizing on the capacities of her pupils. If because of very poor in- 
struction, or, for that matter, of any other reason, her class as a 
whole was doing very poor work, A.Q.’s computed according to Ny- 
gaard’s method would not reveal this fact. On the whole, it seems 
much more desirable to make use of A.Q.’s that permit valid com- 
parisons between all pupils regardless of whether they are in the 
same small group or not. 

A second argument advanced by Nygaard is that the negative 
correlation between A.Q.’s and I.Q.’s will be eliminated.29 To the 
writer, this also seems to possess little or no validity. It is, indeed, 
generally recognized that it would be desirable to alter the fact that 
on the whole bright pupils do poorer work in relation to their capacity 
than do dull pupils, but the mere use of a statistical method or device 
which eliminates this negative correlation without changing the actual 
quality of the work done appears to have no merit. Indeed, it may 
be argued that it is positively undesirable, since it tends to conceal 
true conditions. Rather than to attempt to alter the correlation be- 
tween the A.Q. and the I.Q. by statistical devices, one should follow 
the suggestion made by Torgerson,30 as well as others. This is that 
the negative correlation be eliminated by so classifying or grouping 


29In order to compare the correlation between Nygaard’s A.Q. and the I.Q. with that
between the ordinary A.Q. and the I.Q., the writer had both computed for a group of several
hundred fourth-grade pupils. The coefficient of correlation between ordinary A.Q.’s on the
Stanford Arithmetic Test, Form A, and I.Q.’s computed from the National Intelligence Tests,
Scale A, Form 1, was −.42 and that between Form B and Form 2, respectively, was −.58.
These figures agree with the general trend of the many reported correlations between
A.Q.’s and I.Q.’s. The correlations between Nygaard’s A.Q.’s and the I.Q.’s were
in both cases very small and positive, being respectively .12 and .07.
30Torgerson, T. L. “Is Classification by Mental Ages and Intelligence Quotients Worth
While?” Journal of Educational Research, 13:171-80, March, 1926.




pupils, so planning the work they are to carry, and so instructing them, 
that all come as near as possible to realizing their capacities to the 
fullest extent, and that in so far as this ideal goal is not reached, it 
be approached to approximately the same degree for pupils of all 
degrees of intelligence. 

It is a merit of Nygaard’s formula that it makes allowance for 
the fact that the range of achievement ages of a group is usually not 
equal to that of its mental ages. As Miss Rand31 and Kelley32 have 
pointed out, this condition constitutes a more or less serious statistical 
objection to the A.Q. as usually computed.33 It does not seem to the 
writer, however, that this merit of Nygaard’s proposal is of sufficient 
importance to justify its substitution for the ordinary method. 

Summary. In this chapter, the writer has dealt with certain controversial
questions having to do with measures of achievement relative
to capacity. In the first place, he recommends that “achievement 
quotient” be used for the comparison of achievement with capacity— 
that is, potential mental ability—and that “subject quotient” and “edu- 
cational quotient” be used for the comparison of achievement with 
chronological age. The greater usefulness and significance of com- 
paring achievement with mental age than with chronological age is 
emphasized, and certain arguments that have been advanced in favor 
of the latter comparison answered. It is shown that Franzen’s con- 
cept of an A.R. or an A.Q. of 100 as maximum is erroneous, but that 
instead 100 is the average. The recommendation is made that quotient 
measures rather than difference measures be employed chiefly be- 
cause their use is already much more common. Following this, the 
suggestions of Torgerson, Peters, Otis, Symonds and Nygaard are 
considered critically. Of these only that of Symonds, who proposed 
an index of effort for use above the elementary school, is considered 
to have much practical merit. 


31Rand, Gertrude. “A Discussion of the Quotient Method of Specifying Test Results,”
Journal of Educational Psychology, 16:599-618, December, 1925.
32Kelley, T. L. Statistical Method. New York: The Macmillan Company, 1923, p.
33This statistical consideration is discussed more fully on p. 36f.


CHAPTER IV 


THE VALIDITY OF MEASURES OF ACHIEVEMENT 
RELATIVE TO CAPACITY 


Problem of this chapter. Inasmuch as the validity1 of most of the 
various proposed measures of achievement relative to capacity de- 
pends upon the same conditions and assumptions, it has seemed de- 
sirable to treat this question in a separate chapter and not in connec- 
tion with the comparative merits of the various measures. In general, 
it cannot be said that any one of the measures named and discussed 
in the first three chapters is to be preferred to the others because 
it is more valid. Practically all of those suggested are subject to limi- 
tations of this sort, and it is the purpose of this chapter to point out 
what these are and also to suggest how they may, to some extent at 
least, be avoided. In other words, methods of computing such 
measures that avoid, or partially avoid, these limitations, will be de- 
scribed and criticized. 

In order that measures of achievement relative to capacity be 
valid, it is not sufficient that the separate measures of achievement 
and of capacity upon which they are based be valid. Several other 
conditions must be met. One is that, as Sherrod2 and Popenoe3 have 
pointed out, the age norms upon which achievement quotients are 
based must be perfect if the quotients are to be entirely valid. In 
other words, the basis of transmutation of point scores into mental 
and achievement ages must be perfect. A further condition, pointed 
out by Rand4 and Kelley5 among others, is that the units employed 
in both numerator and denominator or, in other words, in both achieve- 
ment age and mental age, must be equivalent. Still another objection 
to the validity of the A.Q. has been advanced by Goodenough.6 She 
points out that the A.Q. is based upon the assumption that achieve- 
ment age develops parallel with mental age from birth, whereas in 
reality it ordinarily lags a great deal behind mental age until the
beginning of attendance at school. Wilson7 has likewise advanced a
very similar objection. Since the first of these conditions, that of
perfect age norms, is a matter of achievement and mental ages rather
than of quotients, it will not be further discussed here. The others,
however, will be elaborated in the following paragraphs.

1For a definition of validity see Chapter I.
2Sherrod, C. C. “The Development of the Idea of Quotients in Education,” Peabody
Journal of Education, 1:44-49, July, 1923.
3Popenoe, Herbert. “A Report of Certain Significant Deficiencies of the Accomplishment
Quotient,” Journal of Educational Research, 16:40-47, June, 1927.
4Rand, Gertrude. “A Discussion of the Quotient Method of Specifying Test Results,”
Journal of Educational Psychology, 16:599-618, December, 1925.
5Kelley, T. L. Statistical Method. New York: The Macmillan Company, 1923, p.
6Goodenough, F. L. “Efficiency in Learning and the Accomplishment Ratio,” Journal
of Educational Research, 12:297-300, November, 1925.

Necessity of equivalent units for achievement age and mental age. 
Probably the best general discussion of this point is by Kelley,8 who 
points out that three conditions must be met before two scales are 
entirely equivalent. As applied to achievement quotients or similar 
measures, this means that achievement scores and intelligence scores 
must both meet these three conditions in order to make dividing the 
former by the latter justifiable. The three conditions are that one 
point of the first scale must be known to be equal to a point of the 
second, also a second point of the first equal to a second point of the 
second; and that the law of relationship between successive points 
must be the same for the two scales. Kelley does not make any par- 
ticular application to achievement and intelligence tests, but gives 
other illustrations showing very clearly the hazards involved and the 
misleading conclusions that may be drawn if these conditions are not 
met. 

Miss Rand’s discussion9 of this point is perhaps more concrete 
than Kelley’s, upon whose treatment she bases hers. She stated 
that “We are early taught that we must not divide months by years, 
grams by ounces, centimeters by inches. Why, then should we divide 
E.Q.’s by I.Q.’s or E.A.’s by M.A.’s without proof of their equiva- 
lence ....?” Following this, some evidence is offered that the E.Q. unit 
is smaller than the I.Q. unit. This evidence includes two quotations 
from Burt,10 stating her contention as a fact, and data from four 
or five studies. From Ruch11 she quotes standard deviations of the 
I.Q. of 14.2 and similar deviations of the E.Q. of 10.4, 12.0, and 16.3. 
In connection with the New Jersey Composite Test,12 she cites figures 
indicating that the educational test unit is only about two-thirds that 
of the intelligence test unit. For the Lippincott-Chapman Classroom 
Products Survey Tests13 she computed standard deviations and compared
them with those for the Terman Group Test of Mental Ability14
showing that the ratio between the two is approximately nine to
thirteen. Finally, a reference is given to Kelley,15 who reported a
smaller standard deviation for E.Q.’s from the Stanford Achievement
Test than is commonly found for I.Q.’s in the same grade. Miss
Rand’s conclusion appears to be that accomplishment quotients should
not be employed, because their erroneous interpretation and use more
than outweigh their practical value.

7Wilson, W. R. “The Misleading Accomplishment Quotient,” Journal of Educational
Research, 17:1-10, January, 1928.
8Kelley, loc. cit.
9Rand, loc. cit.
10Burt, Cyril. Mental and Scholastic Tests. London: P. S. King and Company, 1922.
11Ruch, G. M. “The Achievement Quotient Technique,” Journal of Educational
Psychology, 14:334-43, September, 1923.
12“Preliminary Standardization of the New Jersey Composite Test.” New
Jersey: Department of Public Instruction, 1923.
13Chapman, J. C. “Lippincott-Chapman Classroom Products Survey Tests.” Philadelphia:
J. B. Lippincott Company.

Miss Rand’s contention that the units employed in expressing 
achievement and intelligence test scores are not equivalent is, in the 
opinion of the writer, entirely justified. Nygaard16 recognized it when 
he urged as one of the merits of his proposal that it would correct 
for this lack of equivalence. Certain data which the writer has com- 
piled tend not only to confirm Miss Rand’s position, but to make the 
situation appear even worse than she has portrayed it. The standard 
deviations of achievement ages computed from scores upon the Stan- 
ford Arithmetic Test, the Monroe General Survey Scale in Arithmetic, 
and the Monroe Standardized Silent Reading Tests, Revised, differ 
considerably from those for the National Intelligence Tests, Scale A, 
and the Illinois General Intelligence Scale administered to the same 
pupils. The ratio of the standard deviation of the Stanford Arith- 
metic Test to that of the National Intelligence Scale was found to 
be about three to four. For the other subject-matter tests mentioned, 
the standard deviations were approximately twice as great as that 
for the Illinois General Intelligence Scale. Since these results are 
based upon two testings of several hundred pupils in the case of each 
test, they may be considered fairly reliable. Certainly the differences 
are great enough that the statement seems justified that the units do 
differ so greatly as not to approach equivalence, although all the differ- 
ences are not in the same direction. Moreover, the fact that most 
of the differences are in the opposite direction from those reported by 
Miss Rand, as well as the fact that all those reported by the author 
do not agree, makes the situation even worse with regard to the A.Q., 
since it appears that in some cases the unit in the numerator is the 
greater, in others that in the denominator is the greater. 

To remedy the situation, Miss Rand proposes a program of re- 
construction including two plans. The first is that all test scores 
be expressed in terms of “an arbitrary scale of values having a fixed 


14“Data on the Repetition of Certain Mental Tests,” Journal of Educational Research,
1923.
15Kelley, T. L. “A New Method for Determining the Significance of Differences in
Intelligence and Achievement Scores,” Journal of Educational Psychology, 14:326,
September, 1923.
16See p. 19.




zero and a fixed scale number or unit which is to be applied to each 
σ or fraction of σ of score above the zero.” She refers with apparent 
approval to a suggestion of Rugg’s17 that such a scale range from 
−2.5σ as zero, to +2.5σ as 100, thus letting ten scale points represent 
each .5σ. She mentions the possibility of employing McCall’s T-scale,18 
but states her opinion that it would be better to make a suitable scale 
for each age at which a test is being used rather than to employ a scale 
based on twelve-year-old performance for all ages. 
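
The scale attributed to Rugg can be illustrated with a brief Python sketch. It converts a raw score to a deviation in standard-deviation units and then to the 0 to 100 scale on which −2.5σ is zero and each half sigma is worth ten points. The function names and the sample mean and standard deviation are hypothetical, and no treatment of scores beyond ±2.5σ is attempted, since the text does not say how these would be handled.

```python
def sigma_units(raw_score: float, mean: float, sd: float) -> float:
    """Deviation of a raw score from the mean, in standard-deviation units."""
    return (raw_score - mean) / sd

def scale_value(z: float) -> float:
    """Map -2.5 sigma to 0 and +2.5 sigma to 100 (twenty points per sigma)."""
    return 50 + 20 * z

# Hypothetical test with mean 40 and standard deviation 8.
for raw in (20, 40, 44, 60):
    print(raw, scale_value(sigma_units(raw, mean=40, sd=8)))
```

Under Miss Rand’s preference, the mean and standard deviation would be taken separately for each age at which the test is used.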

Her second suggestion is that quotients for other tests should be 
arbitrarily made to conform to Stanford-Binet I.Q.’s by the proper 
statistical transmutations to make them equivalent thereto. These, of 
course, would be based on the ratios of the Stanford-Binet I.Q.’s to 
those of the tests in question. For this procedure also, she suggests 
that the proper basis of transmutation for each age be determined and 
used. 

Miss Rand’s proposals appear to be statistically satisfactory, but 
it is doubtful if they will ever receive wide use because of the amount 
of computation necessary. Any measure that is to receive general 
acceptance must be fairly simple to compute and understand, and hers 
suffer decidedly in these respects by comparison with the simple 
achievement quotient and other measures. 

Goodenough’s and Wilson’s argument against the validity of the 
A.Q. As has been mentioned already,19 Miss Goodenough has shown 
that the achievement quotient as ordinarily employed is not entirely 
valid. She centered her attack on the point that “the accomplishment 
ratio does not afford a valid means for comparing the learning effi- 
ciency of individuals or of groups who differ widely in intelligence.” 
Table IV, taken from her discussion, shows the various rates at which 
bright pupils must do their work to earn accomplishment ratios of 
100 at various mental ages. From this it can easily be seen that in 
order to maintain equal A.R.’s, bright pupils must learn at a much 
greater rate than dull pupils. For example, a pupil with a mental 
age of 10 and an I.Q. of 120 has, on the average, been in school only 
4 semesters, whereas one of the same mental age but with an I.Q. of 
80 has attended 12 semesters or three times as long.20 Therefore, if 
the two started at the same point when they entered school, the bright- 


17Rugg, H. O. Statistical Methods Applied to Education. Boston: Houghton Mifflin
Company, 1917.
18See p. 8.
19See p. 35f.
20The first pupil mentioned has a chronological age of 10/1.20, or 8⅓ years, the second of
10/.80, or 12.5 years. Thus, the former has probably spent 2 years, or 4 semesters in school,
the latter 6 years, or 12 semesters.




TABLE IV. LENGTH OF SCHOOL ATTENDANCE AT VARIOUS MENTAL
AGES FOR CHILDREN OF DIFFERENT INTELLECTUAL LEVELSa

                 Length of Attendance in Semesters
Intelligence     if Mental Ages are as Indicated
Quotient          8-0    10-0    12-0    14-0    16-0
140               (0)      2       5       7      10
120                1       4       7      11      14
100                3       7      11      15      19
 80                7      12      17
 60               14

Table reads: A child whose intelligence quotient is 140, if he has a mental age of 8, will normally
have had no school attendance; if he has a mental age of 10, he will have attended 2 semesters; if he
has a mental age of 12, he will have attended 5 semesters; etc.

aGoodenough, F. L. “Efficiency in Learning and the Accomplishment Ratio,” Journal of Educational
Research, 12:299, November, 1925.


er one must cover the work three times as fast as the duller one, 
although his I.Q. is only one and one-half times as great. 
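
The arithmetic behind this comparison can be set out in a short Python sketch. Chronological age is mental age divided by the I.Q.; school attendance is then the time elapsed since entrance. The entrance age of six and one-half years is an assumption made here to approximate the figures in the example, not a value stated by the author, and the function names are illustrative.

```python
ENTRY_AGE_YEARS = 6.5  # assumed age at entering school (not from the text)

def chronological_age(mental_age_years: float, iq: float) -> float:
    """C.A. in years, from M.A. = C.A. x I.Q. / 100."""
    return mental_age_years * 100 / iq

def semesters_attended(mental_age_years: float, iq: float) -> float:
    """Two semesters for every year since the assumed entrance age."""
    years = chronological_age(mental_age_years, iq) - ENTRY_AGE_YEARS
    return 2 * max(0.0, years)

for iq in (120, 80):
    ca = chronological_age(10, iq)
    print(iq, round(ca, 2), round(semesters_attended(10, iq), 1))
```

With a mental age of ten, the pupil of I.Q. 80 has roughly three times the schooling of the pupil of I.Q. 120, while the ratio of their I.Q.’s is only one and one-half.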

Wilson21 has given attention to this same point, that the achievement
quotient is not valid for comparing pupils of different degrees 
of ability. In his discussion he offers an elementary proof that the 
accomplishment ratio method, as he calls it, results in a spurious in- 
crease of the A.Q.’s of pupils of below average ability, and a decrease 
for those above. To prove this, Wilson took the 48 pupils in Franzen’s 
group and assumed that their obtained I.Q.’s were perfectly accurate 
and that their true efficiency quotients were the same as their intelli- 
gence quotients; in other words, that the true accomplishment quo- 
tient of each pupil was 100. On the assumption of a probable error 
of three points, which is much smaller than is usually found in actual 
practice, for both I.Q. and E.Q., and a random distribution of errors, 
he obtained intelligence and efficiency quotients such as might be 
secured in actual testing and from them computed accomplishment 
quotients. Although the true A.Q.’s were all 100, those obtained by 
his method ranged from 85 to 117. Furthermore, although the correlation
between true A.Q.’s and I.Q.’s was zero, that for the obtained
A.Q.’s and I.Q.’s was -.38. Also he gave a simple geometrical proof
that the correlation between actually obtained I.Q.’s and A.Q.’s must
be negative. He concluded that our present measures of ability and
achievement are so lacking in reliability, and perhaps also in validity,
that “they cannot safely be assumed to tell with definiteness anything
concerning the true accomplishment quotients of the students measured.”
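
Wilson's procedure lends itself to a small simulation, sketched below in Python. It is not a reproduction of his computation: the I.Q.'s of Franzen's 48 pupils are not given here, so true I.Q.'s are drawn from an assumed normal distribution (mean 100, standard deviation 15), and the seed and function names are arbitrary. Only the probable error of three points and the premise that every true A.Q. is 100 are taken from the account above.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson product-moment correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def simulate(n_pupils, probable_error=3.0, seed=0):
    """Obtained I.Q.'s and A.Q.'s when every true A.Q. is exactly 100."""
    rng = random.Random(seed)
    sigma = probable_error / 0.6745        # probable error -> standard deviation
    iqs, aqs = [], []
    for _ in range(n_pupils):
        true_iq = rng.gauss(100, 15)       # assumed spread of true I.Q.'s
        obtained_iq = true_iq + rng.gauss(0, sigma)
        obtained_eq = true_iq + rng.gauss(0, sigma)  # true E.Q. equals true I.Q.
        iqs.append(obtained_iq)
        aqs.append(100 * obtained_eq / obtained_iq)
    return iqs, aqs

iqs, aqs = simulate(48)
print(round(min(aqs)), round(max(aqs)), round(pearson(iqs, aqs), 2))

# With many pupils the sampling noise fades and the spurious negative
# correlation between obtained I.Q.'s and A.Q.'s stands out.
big_iqs, big_aqs = simulate(20000)
print(round(pearson(big_iqs, big_aqs), 2))
```

The size of the negative correlation depends upon the assumed spread of true I.Q.'s; the narrower the group, the larger the spurious coefficient.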
Summary. In this chapter, several conditions prerequisite to the 
validity of measures of achievement relative to capacity have been 


²¹Wilson, W. R. “The Misleading Accomplishment Quotient,” Journal of Educational
Research, 17:1-10, January, 1928.




stated, and it has been shown that ordinarily these are not satisfactori- 
ly fulfilled. The age norms upon which achievement quotients are 
based are not perfect. The units in which mental ages and achieve- 
ment ages are expressed are frequently not equivalent. The implied 
assumption that the achievement age should develop parallel with 
mental age from birth is not true to the facts. One or two proposals 
for improving the situation by rendering achievement quotients statis- 
tically valid in so far as certain points are concerned have been given. 
It does not appear, however, that these proposals are likely to re- 
ceive wide use because of their lack of simplicity, although either one 
of Miss Rand’s suggested procedures would remove the chief statis- 
tical hindrance to the validity of the A.Q. On the whole, the con- 
clusion appears inevitable that if the achievement quotient or any 
similar measure is to be employed, it must be only in a very cau- 
tious and general manner, since its validity is too low to justify more 
exact use. 


CHAPTER V 


THE RELIABILITY OF MEASURES OF ACHIEVEMENT 
RELATIVE TO CAPACITY 


Problem of this chapter. It is the purpose of this chapter to present
evidence and draw conclusions as to the reliability¹ of measures
of achievement relative to capacity, especially of the achievement quo- 
tient. Results given by several other writers will be briefly presented, 
as also will be some hitherto unpublished data gathered by the present 
writer, and finally the conclusions that seem warranted will be stated. 


The bases of reliability. Undoubtedly, the primary basis of the 
reliability of achievement quotients or other measures of achievement 
relative to capacity is that the achievement and mental scores or, in 
other words, the achievement and general intelligence tests upon which 
they are based, be themselves reliable. This has been pointed out by 
a number of writers, including Toops and Symonds,² Chapman,³ Beeson
and Tope,⁴ Foran⁵ and Herring.⁶ It is evident, without going
into the matter from a statistical standpoint, that a quotient or other
quantity computed from two unreliable quantities will tend to be
more unreliable than either one of them, since in many cases positive
errors will be added to positive ones, and in many others, negative
to negative. Such statements as Herring’s⁷ that “accomplishment
differences are comparatively reliable when the tests employed are comparatively
reliable” are not justified unless the word “comparatively”
is interpreted more loosely in one place than in the other. Neither


¹As was stated in the explanation of reliability given on p. 11, a test or measure is
said to be reliable not only if a second application yields the same scores as the first, but
also if there is a constant and known difference between the two sets of scores. In other
words, a test is reliable if variable errors (that is, errors which are more or less accidental
and differ for the different individuals concerned) are eliminated. There may be constant
errors, errors that tend to be the same for the whole group, present and yet reliability be
perfect or nearly so. Such causes as too long or too short time limits, practice effect from
having had a similar test very recently, and so forth, tend to raise or lower, as the case
may be, the scores made by all members of the group tested. This effect, of course, carries
over to achievement quotients or any other measures of achievement relative to capacity, and
makes them less accurate though not less reliable in the technical sense of the term. It
should not be overlooked, therefore, that relative measures may contain constant errors
whether or not their reliability is high. If reliability is very high, such errors will be practically
the only ones present; if not, they will be present in addition to the variable errors.

²Toops, H. A. and Symonds, P. M. “What Shall We Expect of the A.Q.?” Journal of
Educational Psychology, 13:513-28, December, 1922; 14:27-38, January, 1923.

³Chapman, J. C. “The Unreliability of the Difference Between Intelligence and
Educational Ratings,” Journal of Educational Psychology, 14:103-8, February, 1923.

⁴Beeson, M. F. and Tope, R. E. “The Educational and Accomplishment Quotients as
an Aid in the Classification of Pupils,” Journal of Educational Research, 9:281-92, April,
1924.


⁵Foran, T. G. “The Meaning and Limitations of Scores, Norms, and Standards in
Educational Measurement,” [?], Vol. 3, [?]. Washington: Catholic Education Press,
February, 1926, p. [?].
⁶Herring, J. P. “The Reliability of Accomplishment Differences,” Journal of Educational
Psychology, 15:530-38, November, 1924.

⁷Ibid.




can Wilson’s assumption⁸ that a coefficient of reliability⁹ of test scores
of .90 is fairly satisfactory be considered valid. For such a coefficient,
the element of uncertainty or guessing¹⁰ in the scores is about
four-ninths. Since the element of error in a quotient is greater than
that in either the numerator or denominator, and therefore greater
than four-ninths for a reliability of .90, it can scarcely be said that
such a coefficient produces a reliability of achievement quotients that
is at all satisfactory. Herring¹¹ has suggested that the requisite reliability
of test scores to yield satisfactory A.Q.’s should be .95. This
is probably as high as can be hoped for with even our best tests,¹²
and yet the element of guessing connected with a coefficient of correlation
of .95 is almost one-third, and consequently that with the resulting
quotient still greater. Ruch¹³ has suggested that apparently
the chief cause of too great unreliability of achievement quotients is 
that the tests upon which they are based are too short from the stand- 
point of time spent by the pupils in taking them. He says that appar- 
ently I.Q.’s should be based upon tests which require at least thirty 
minutes of working time. It does not appear, however, that this 
proposal is satisfactory, as it is well known that the reliability of most 
tests that consume thirty minutes or even more does not approach even 
.95 very closely. Practically all of the few tests which do equal or 
exceed this figure are either individual intelligence tests or group tests 
lasting two or three hours. 
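
The fractions quoted above for the element of guessing agree with the usual coefficient of alienation, k = sqrt(1 - r^2). The short Python sketch below assumes that this is the measure intended (the footnote defining it lies outside this chapter) and simply evaluates it for the two reliabilities discussed; the function name is illustrative.

```python
from math import sqrt

def coefficient_of_alienation(r):
    """k = sqrt(1 - r^2), the 'guessing element' that accompanies a correlation r."""
    return sqrt(1.0 - r * r)

for r in (0.90, 0.95):
    print(r, round(coefficient_of_alienation(r), 3))
```

A reliability of .90 gives a value close to four-ninths and one of .95 a value somewhat under one-third, which shows how slowly the guessing element shrinks as the coefficient of reliability approaches unity.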

It is true, as Ruch points out, that one important factor in relia- 
bility is frequently the short time limit of a test. For exercises of the 
same sort, reliability increases as the square root of the ratio of the 
times. For example, the reliability of a test four times as long as 
another is twice as great. In practice, one of the most convenient 
ways of increasing reliability through devoting more time to testing is, 
as Beeson and Tope¹⁴ have pointed out, to employ average measures
based upon several group tests or perhaps, in the case of intelligence,
scores from individual tests.


⁸Wilson, W. R. “The Misleading Accomplishment Quotient,” Journal of Educational
Research, 17:1-10, January, 1928.

⁹The coefficient of reliability is the coefficient of correlation between the scores secured
from two applications of the same test or duplicate forms thereof within a short period.

¹¹Herring, J. P. “The Reliability of Accomplishment Differences,” Journal of Educational
Psychology, 15:530-38, November, 1924.

¹²Very few tests yield reliability coefficients as high as .95, or for that matter, even as
high as .90.

¹³Ruch, G. M. “The Achievement Quotient Technique,” Journal of Educational Psychology,
14:334-43, September, 1923.

¹⁴Beeson, M. F. and Tope, R. E. “The Educational and Accomplishment Quotients as
an Aid in the Classification of Pupils,” Journal of Educational Research, 9:281-92, April,
1924.




Reported data on the reliability of the A.Q. Among those who 
have reported such data is Symonds.¹⁵ As a result of employing the
National Intelligence Tests, the Woody-McCall Mixed Fundamentals
in Arithmetic, and the Thorndike-McCall Reading Scale, he found
coefficients of correlation or reliability between achievement ratios
from first and second forms of from .23 to .60, with probable errors¹⁶
of 6 or 7 points. He showed that if the tests were given twice or 
two forms used as one, the probable error of the A.R. became slight- 
ly less than 5 points in each case, or, in other words, probably about 
the same as that of an I.Q. based upon one of the best individual in- 
telligence tests. He next proceeded to consider the reliability of school 
marks based upon the achievement ratio, and showed that it is possible 
to adopt a five-letter marking system such that the probable error of 
the achievement ratio varies from about one-fourth to less than one- 
half of a letter interval. With other five and six-letter marking 
systems, however, the probable error of the A.R. may be as great as 
two-thirds of a letter interval. After comparing these figures with 
previously published data concerning the reliability of scores on the 
Woody-McCall Arithmetic and Thorndike-McCall Reading Scales for 
grade placement, he concluded that “the A.R. is more accurate a 
unit for marking than the score is accurate for placing in the proper 
grade.” Further, he stated that the reliability coefficients of the arithmetic
A.R. (.60 and .49) were comparable with those of ordinary
school marks. This, however, did not hold for the reading A.Q., for 
which the coefficients were only .34 and .23. 

Popenoe¹⁷ also has reported some information of this sort. For
600 pupils, he found a coefficient of reliability of the A.Q. of .28 
and a probable error of about six points. He mentioned also that 
several minor studies in which the mental age was kept constant so 
that only the numerator contained errors yielded coefficients of relia- 
bility of only about .50. 

There have not been a great many published reports dealing di- 
rectly with the reliability of achievement quotients. The two just 
referred to, those of Symonds and Popenoe, are two of the best, and 
also are thoroughly typical. Such figures as they give seem to show 


¹⁵Symonds, P. M. “The Accuracy of Certain Standard Tests for School Sectioning and
Marking,” Journal of Educational Psychology, 15:423-32, October, 1924.

¹⁶A probable error is greater than half of the errors concerned, and less than the
other half. Thus, in the example given above, a probable error of six or seven points means
that the errors in half of the scores were less than this amount, and those in the other half
greater.

¹⁷Popenoe, Herbert. “A Report of Certain Significant Deficiencies of the Accomplishment
Quotient,” Journal of Educational Research, 16:40-47, June, 1927.




that the reliability of the A.Q. is so low that comparatively little confi- 
dence can be placed in it. 

The writer’s data on the reliability of the A.Q. As has already 
been stated, the writer wishes to supplement the rather meager pub- 
lished data upon the reliability of the A.Q. by presenting some which 
he has recently compiled. These were obtained from two sources.
The National Intelligence Test, Scale A, Forms 1 and 2, and the 
Arithmetic Examination of the Stanford Achievement Test, Forms A 
and B, were given to more than 200 fourth-grade children in an 
Illinois city. About the same time, the Illinois General Intelligence 
Scale, Forms 1 and 2, was administered to approximately 800 eighth- 
grade pupils in another Illinois city. Half of this latter group also 
took Forms 1 and 2 of Test II of the Monroe Standardized Silent 
Reading Test, Revised, and the other half took Forms 1 and 2 of 
Test II of the Monroe General Survey Scale in Arithmetic. In both 
cities, the regular classroom teachers gave the tests, thus reproducing 
ordinary conditions. The test scores were turned into achievement 
and mental ages in the regular manner, and then the achievement quo- 
tients computed. The coefficients of reliability of A.A.’s, M.A.’s, and 
A.Q.’s, their probable errors of measurement and certain other 
measures of reliability were computed and are presented in Table V. 

It will be observed that the body of this table is divided into two 
parts. In the first, that is, the one to the left, will be found the meas- 
ures of reliability of the achievement and mental ages of the pupils 
and in the other the corresponding measures for the achievement quo- 
tients. The first column in each half of the table contains the coeffi- 
cients of correlation, in this case coefficients of reliability, between the 
results from the first and second forms of each test. The next pair of 
columns, headed “k”, contain the coefficients of alienation.¹⁸ Following
these in each case may be found the probable errors of measurement.¹⁹
The next two columns contain the probable errors of measurement
divided by the means, and finally the last two contain the probable
errors of measurement divided by the standard deviations.²⁰


¹⁸See footnote on p. 27.

¹⁹The probable error of measurement is the probable error involved in estimating true
scores from actual scores. For example, a probable error of measurement of 3.9, which is
given in the table as that of the Stanford Arithmetic Age, means that if the pupils’ arithmetic
ages made upon either form of the test be taken as their true arithmetic ages, the errors in
the cases of half the pupils will be less than 3.9 months and those for the other half greater
than this amount. It is ordinarily computed by the formula
$P.E._{meas.} = .6745\,\sigma\sqrt{1 - r}$, in which r is the coefficient of reliability, that is,
of correlation between two forms of the test.


²⁰Since the significance of a probable error of a given amount depends to a considerable
extent upon the size of the measure for which the error has been found, it has been
recommended that the probable error of measurement be divided by such a divisor that the
result in any one case may be compared with that in any other. For example, a probable
error of measurement of six months might be found for the achievement ages of primary


TABLE V. MEASURES OF RELIABILITY OF AGE AND QUOTIENT SCORES OBTAINED BY THE WRITER

[The body of this table is not legible in the scanned copy. Its rows are Stanford Arithmetic,
Monroe Arithmetic, Monroe Reading Comprehension, Monroe Reading Rate, National Intelligence,
and Illinois Intelligence. Its columns give, first for age scores and then for quotient scores,
r, k, P.E. meas. (in months for ages, in points for quotients), P.E. meas. / M, and P.E. meas. / σ.]




It will be seen from the coefficients of correlation and also of 
alienation given in the table that the reliability of the achievement and 
mental ages is not very high. In the case of the achievement tests, 
the correlations range from .61 to .76, and the corresponding coefficients
of alienation from .65 to .79. That is, the guessing element²¹
ranges from about two-thirds to four-fifths. In the case of the two 
intelligence tests, the situation is somewhat better, although even with 
them the coefficients of alienation are as large as .57 and .64. The 
corresponding coefficients of correlation for the achievement quo- 
tients range from .39 to .58 and those of alienation from .81 to .93. 
In other words, in the very best case shown by these data, the guess- 
ing element involved in the achievement quotient is slightly greater 
than four-fifths, whereas in the worst case it is well above nine-tenths. 
The columns containing the probable errors of measurement and the 
quotients of these errors divided by the means and standard devia- 
tions, respectively, show the same tendencies in a different manner. 
For the Stanford Arithmetic Test they are relatively small, the prob- 
able error of the quotients obtained upon the test being less than 
five points, or, in other words, less than five per cent of the mean 
quotient and also less than half of the standard deviation of the quo- 
tients. For the Monroe Arithmetic and Reading Tests the actual 
probable errors of measurement are from two to three times as large 
as in the case of the Stanford Arithmetic, but when taken relative to 
the means of these tests, they are only from about one and one-half 
to two times as great. When divided by the standard deviations, 
they are nearly the same. Evidently, probable errors which amount to 
close to one-tenth of the mean scores are so serious that little depend- 
ence can be placed upon the reliability of the quotients. To compare 
the situation with a more familiar example, it is the same as if the 
probable error of measurement involved in measuring the heights of 
adults were six or seven inches. It is readily seen that for almost 
all purposes measurements of height of which half were in error by
more than six or seven inches would be practically worthless. It 
seems to the writer, therefore, that from these data one must conclude 
that achievement quotients based upon a single administration of the 
tests used are so much in error that one is rarely if ever justified 
in employing them for individual diagnosis. 
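
The relation between a coefficient of reliability and the probable error of measurement can be illustrated with the usual formula, P.E. meas. = .6745 σ sqrt(1 - r). The Python sketch below assumes that formula and sets σ equal to 1, so that the result is the probable error expressed as a fraction of the standard deviation; the function name is illustrative and the two reliabilities are the lowest and highest reported above for the achievement quotients.

```python
from math import sqrt

def pe_of_measurement(sd, r):
    """Probable error of measurement: .6745 * sd * sqrt(1 - r)."""
    return 0.6745 * sd * sqrt(1.0 - r)

# With sd = 1 the result is the ratio of the probable error to the
# standard deviation of the scores.
for r in (0.39, 0.58):
    print(r, round(pe_of_measurement(1.0, r), 2))
```

Even the more reliable quotients thus carry a probable error of nearly half a standard deviation, and the less reliable ones of rather more than half.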


pupils and one of eight months for those of upper-grade pupils. However, since the
age scores of the latter will be ordinarily about twice as great as those of the former, the
relative error in their case will be smaller than in that of the primary children. The two
quantities which have been suggested as divisors are the mean and the standard deviation.
Each has certain advantages over the other and neither is perfect; therefore it has seemed [...]

²¹See footnote on p. 27.




The reliability of average A.Q.’s for groups of pupils. The data 
so far presented, both from other sources and from the writer’s investigation,
show that the A.Q.’s of individual pupils are decidedly
unreliable, but do not deal with average A.Q.’s of groups of pupils. 
The conclusion just stated as to individual A.Q.’s, however, does not 
warrant the same conclusion concerning average A.Q.’s of groups. 
As has been pointed out by Kelley,²² Myers,²³ and others, average
A.Q.’s of groups are, on the other hand, ordinarily rather reliable. 
This conclusion is, of course, merely the application of an elementary 
statistical principle or formula, that the reliability of a measure increases
in direct proportion to the square root of the number of cases²⁴
upon which it is based. This decidedly important point has apparently 
been overlooked by most of those who have written concerning the 
reliability of the A.Q. This is unfortunate since occasions frequent- 
ly arise in which it is desirable to employ the average A.Q. of a class 
or some other group without making any use of the A.Q.’s of its indi- 
vidual members. 

This conclusion is supported by the writer’s data shown in Table 
V. In the ordinary elementary school there will be very few classes as 
small as twenty-five and in many cases, they will be at least as large as 
thirty-six. Therefore, in accord with the statistical principle stated in 
the last paragraph, the reliability of average achievement quotients for 
usual elementary-school classes will be at least five or six²⁵ times as
great as that of individual achievement quotients as shown in Table 
V. Dividing probable errors of measurement given in this table by 
five or six yields such errors of only one or two points, or, expressed 
otherwise, of about 1 or 2 per cent of the means, or one-tenth 
of the standard deviations. Errors of this magnitude are small 
enough that one is justified in placing considerable reliance upon the 
accuracy of average A.Q.’s. If differences in the average A.Q.’s of 
classes of approximately five points are found to exist, it will ordi- 
narily be rather safe to assume that they indicate real differences in 
the degree to which the classes as wholes have capitalized their ca- 
pacity to learn. For differences smaller than this, such a conclusion, if 
drawn at all, should ordinarily be merely tentative. 
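
The square-root principle applied in the last two paragraphs may be sketched in Python as follows. The sketch assumes that the errors of the individual pupils are independent of one another, which the principle requires; the individual probable errors of 5 and 10 points are round illustrative figures rather than values taken from Table V, and the function name is hypothetical.

```python
from math import sqrt

def pe_of_group_average(pe_individual, n_pupils):
    """Probable error of a class average: individual error / sqrt(class size)."""
    return pe_individual / sqrt(n_pupils)

# Classes of 25 and 36 pupils, as in the discussion above.
for pe in (5.0, 10.0):
    for n in (25, 36):
        print(pe, n, round(pe_of_group_average(pe, n), 2))
```

For an individual error of ten points and a class of 25 the function returns two points, the case worked in the accompanying footnote.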


²²Kelley, T. L. Interpretation of Educational Measurements. Yonkers: World Book
Company, 1927, p. 7-8, 22-25.

²³Myers, [?]. “The Accomplishment Ratio,” Research Bulletin of the Pennsylvania
State Education Association, No. 3. Harrisburg, Pennsylvania: Pennsylvania State Education
Association, January, 1928, p. 38-40.

²⁴The statement made above may perhaps be made clearer by an illustration. Since the
square root of 25 is 5, the average A.Q. of a group of 25 pupils will be five times as reliable
as the A.Q. of one of the individual pupils composing the group. Therefore, if the average
error in the individual A.Q.’s is, let us say, ten points, that in the group A.Q. will be only
one-fifth as great, or two points.

²⁵Five and six are the square roots of 25 and 36, respectively.




Summary. The reliability of an achievement quotient or other
measure of achievement relative to capacity is primarily determined
by the reliability of the original measures upon which it is based, and
is less than that of either one of the two measures unless one or both
happens to possess perfect reliability, which, of course, is never true.
Very few of our standardized tests possess high enough reliability 
that the errors involved in achievement quotients computed from 
scores thereon may safely be neglected. A comparatively few data 
previously reported indicate that the coefficient of reliability of 
achievement quotients is probably in most cases below .50, and its 
probable error at least six or seven points. Data collected by the 
writer likewise yield results in entire agreement with these conclu- 
sions. Thus it may be said that all the available direct data as to the 
reliability of the A.Q. support the inferences drawn from the low 
reliability of tests, that only in very exceptional cases are the A.Q.’s 
of individual pupils reliable enough to furnish safe guides in dealing 
with such pupils. On the other hand, the average A.Q.’s of classes or 
larger groups containing a number of pupils are probably reliable
enough that we are justified in employing them as measures of the 
achievement of the group as a whole. 


CHAPTER VI 
SUMMARY AND CONCLUSIONS 


Because the use of various measures of achievement relative to 
capacity has been so widespread and also in general so non-critical, 
it has seemed worth while to prepare a critical study of such meas- 
ures. A brief account of the origin of the various suggested measures 
of this sort is followed by discussions of the merits and demerits of 
the different measures and of the validity and reliability of such
measures in general. The following of the proposed measures have 
received fairly wide use: educational quotient, subject quotient, 
achievement quotient or accomplishment quotient, and accomplish- 
ment ratio. Pintner’s difference method, Torgerson’s efficiency quo- 
tient, Peters’ high-school and college accomplishment quotient and 
Otis’ similar measure, Symonds’ index of effort, Nygaard’s modified 
accomplishment quotient and Rand’s suggested program of recon- 
structing such measures, have received either no use at all, except 
perhaps by their originators, or a comparatively small amount. 

The writer recommends that ‘‘quotient” be used rather than “ratio” 
and, therefore, “achievement quotient” or “accomplishment quotient” 
be applied to achievement or accomplishment age divided by mental 
age; that “subject quotient” and “educational quotient” be employed 
when the divisor is chronological age; that Symonds’ “index of effort” 
be employed where satisfactory quotient measures are not available;
and that the other suggested measures be dropped either because 
they require too elaborate computation or because they are not needed. 
The comparison of achievement with mental age is more significant 
than that with chronological age. Franzen’s concept of an A.Q. of
100 as the theoretical maximum is erroneous; an A.Q. of this size is 
just average. The validity of most quotient measures is not very high, 
chiefly because the units in the numerator and denominator are not 
equivalent. The proposals for improving this condition appear to be 
too complicated to receive general use. 

A review of all known studies of the reliability of measures of 
achievement relative to capacity leads to the conclusion that their 
reliability is decidedly unsatisfactory. This is supported by original 
data obtained and presented by the writer. Indeed, their reliability 
is so low that it is recommended that they never be employed for the 
diagnosis, classification, or other treatment of individual pupils except 




possibly when they are based upon the combined results from several 
group tests or one individual intelligence test. Relative measures
for a class or larger group do, however, possess high enough reliability
to warrant their use.


APPENDIX A 


THE REDUCTION OF NEGATIVE CORRELATION BETWEEN 
INTELLIGENCE QUOTIENTS AND ACHIEVEMENT 
QUOTIENTS 


As has been shown in the body of this bulletin,¹ negative correlations
are almost always found to exist between achievement quotients
or similar measures and intelligence quotients. In other words, instruc- 
tion is such that pupils of superior capacity are not stimulated to ap- 
proach their maximum achievement as nearly as are pupils who are 
less highly endowed. It has been suggested, and indeed in some cases 
shown, that such negative correlations can be lessened, perhaps even 
reduced to zero; that is, to a figure which indicates that pupils of all 
levels of capacity are capitalizing their potential abilities to an equal 
degree. 

Probably the first writer to call attention to this possibility was 
Franzen. As a part of his study of the accomplishment ratio,² he
made an experimental attempt to motivate a rather small group of 
pupils so as to raise their accomplishment ratios to the maximum, 
which he erroneously considered to be 100.³ Although Franzen concluded
that it is possible to motivate pupils so that their A.R.’s will
approach 100 rather closely, Wilson⁴ has shown that this conclusion is
unjustified. Using Franzen’s own figures, he shows that the negative 
correlation between I.Q.’s and A.R.’s was not significantly changed by 
two years’ stimulation of the pupils. 

Another worker who has discussed this point is Torgerson,⁵ who
emphasized the point that “proper grade placement . . . . tends to
raise the accomplishment quotient of all pupils to a normal maximal 
efficiency.” In support of this he cited data secured from a study in- 
cluding several hundred pupils. These data showed that as the pupils 
were placed in their grades, those who were retarded had a median 
accomplishment quotient of 100; those normally placed, of 107; and 
those accelerated, of 118. Furthermore, they showed an average ac- 
complishment quotient of 107 for pupils with intelligence quotients 
below 90, of 101 for those with I.Q.’s from 90 to 109, and of 93 for


¹See p. 23f.

²See p. 14.

³The fallacy of Franzen’s contention that the maximum A.R. or A.Q. is 100, is shown
on p. [?].

⁴Wilson, W. R. “The Misleading Accomplishment Quotient,” Journal of Educational
Research, 17:1-10, January, 1928.

⁵Torgerson, T. L. “Is Classification by Mental Ages and Intelligence Quotients Worth
While?” Journal of Educational Research, 13:171-80, March, 1926.




those with I.Q.’s above 110. This evidence agrees very well with 
what others have found. Torgerson, however, gave further data to 
show that for pupils properly graded, the average A.Q.’s for the 
three groups according to intelligence quotients were, respectively, 
108, 109, and 112, or, in other words, that the differences between 
them were very small. He concluded, therefore, “that when pupils 
are properly graded the inverse relationship between intelligence quo- 
tient and accomplishment quotient disappears.” 

Popenoe,⁶ on the other hand, has cited some figures which do not
support Torgerson’s argument. Twenty-four elementary schools in 
which there were larger than average negative correlations between 
A.Q.’s and I.Q.’s were chosen and attempts made to reduce the negative
correlation. Subsequent testing, however, indicated that this correlation,
which averaged -.59 at the beginning of the experiment, was
not materially changed. Despite Popenoe’s findings, however, it appears
that proper grade placement, satisfactory motivation, and teaching
methods adapted to pupils’ abilities will generally result in reducing
the negative correlation between intelligence quotients and accomplish- 
ment quotients to practically zero. 


⁶Popenoe, Herbert. “A Report of Certain Significant Deficiencies of the Accomplishment
Quotient,” Journal of Educational Research, 16:40-47, June, 1927.


APPENDIX B 


ESTIMATING THE RELIABILITY OF ACHIEVEMENT 
QUOTIENTS FROM THAT OF ACHIEVEMENT 
AND MENTAL AGES 


It sometimes happens that it is convenient to be able to estimate 
the reliability of achievement quotients without actually computing 
measures thereof directly from the quotients themselves. The writer, 
therefore, has attempted to discover a method of doing so when the 
reliabilities of achievement and mental ages are known. So far as he 
was able to learn, no valid formula for this purpose has been devised 
and published. There are, of course, well established formulae for 
measuring the reliability of index numbers and other quotients in 
which all of a series have the same denominator, and also for certain 
other expressions somewhat similar to the A.Q. A reasonably diligent 
and exhaustive search, however, failed to reveal any formula entirely 
appropriate to the purpose under discussion. The writer did, however, 
discover two methods, one somewhat better than the other, by which 
the reliability of achievement quotients may be estimated approximate- 
ly when those of achievement and mental ages are known. Both of these 
deal with probable or standard errors. Nothing based upon coefficients
of reliability was found.

One of these two methods involves the use of a standard formula 
for the error of a quotient.¹ This formula is as follows, expressed
in symbols ordinarily used in educational work:

$$\mathrm{P.E.}_{X/Y} = \frac{\sqrt{\left(\dfrac{X \cdot \mathrm{P.E.}_{Y}}{Y}\right)^{2} + \mathrm{P.E.}_{X}^{2}}}{Y}$$


Thus if, in the case of a particular pupil, the probable error of his
achievement age (X), and also that of his mental age (Y), are known,
it is possible to compute that of his achievement quotient (X/Y). This
formula would require a separate computation and yield a different
probable error for each pupil except in the case of two or more whose 
achievement ages and mental ages were the same. However, this is 
not quite what is desired in ordinary work with age and quotient 


1 It may be found in:
Mellor, J. W. Higher Mathematics for Students of Chemistry and Physics. London:
Longmans, Green and Company, 1909, p. 529.


TABLE VI. COMPARISON OF PROBABLE ERRORS OF ACHIEVEMENT QUOTIENTS
COMPUTED BY THE FORMULA

$$\mathrm{P.E.}_{X/Y} = \frac{\sqrt{\left(\dfrac{X \cdot \mathrm{P.E.}_{Y}}{Y}\right)^{2} + \mathrm{P.E.}_{X}^{2}}}{Y}$$

WITH THOSE COMPUTED FROM THE QUOTIENTS THEMSELVES


                          P.E. meas. in months     P.E. meas. / M           P.E. meas. / σ
                          By         From          By         From          By           From
                          Formula    Quotients     Formula    Quotients     Formula      Quotients
Stanford Arithmetic       4.8        4.6           .05        .05           .44          .45
Monroe Arithmetic         9.6        9.6           .08        .08           [illegible]  .47
Monroe Reading
  Comprehension           8.9        9.2           .09        .09           .54          .53
Monroe Reading Rate       12.5       13.0          .11        .11           .54          .44


scores. The desideratum is rather a formula which may be solved 
just once to yield a single probable error that applies to the quotient 
scores of the whole group of pupils in question. 
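
The per-pupil computation described above can be sketched in Python. The function name `pe_quotient`, the three pupils, and the probable errors assigned to their ages are hypothetical. They are chosen only to illustrate that the formula returns a different value for each pupil, and are not taken from the writer's data.

```python
import math


def pe_quotient(x, y, pe_x, pe_y):
    """Probable error of the quotient x / y by the standard quotient formula.

    x, pe_x: achievement age and its probable error (same unit, e.g. months)
    y, pe_y: mental age and its probable error (same unit)
    """
    if y == 0:
        raise ValueError("mental age (denominator) must be non-zero")
    return math.sqrt((x * pe_y / y) ** 2 + pe_x ** 2) / y


# Hypothetical pupils: (achievement age, mental age) in months.
pupils = [(120.0, 100.0), (132.0, 144.0), (150.0, 150.0)]
PE_X, PE_Y = 4.0, 3.0  # assumed probable errors of the two ages, in months

per_pupil = [pe_quotient(x, y, PE_X, PE_Y) for x, y in pupils]
for (x, y), pe in zip(pupils, per_pupil):
    print(f"AA={x:5.1f}  MA={y:5.1f}  A.Q.={x / y:.2f}  P.E.={pe:.4f}")
```

Even with the same probable errors of the ages assumed for every pupil, the three results differ, which is why a single group-level figure is wanted.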

Apparently the most satisfactory way of attempting to procure a 
measure of the kind mentioned from this formula is to substitute in 
it the average achievement and mental ages of the group rather than 
those of an individual pupil, and thus secure a probable error appli- 
cable to the achievement quotients of the whole group. This has been 
done by the writer and the results given in Table VI. The first pair 
of columns in this table compare the probable errors of measurement 
by the formula with those actually computed from the quotients.2 The
second pair of columns does the same for the ratios of the probable 
errors of measurement to the means and the third for their ratios to 
the standard deviations. In applying the formula in the cases of both 
these ratios, it is slightly modified by dropping the denominator Y. 
This is necessary because the probable errors have already been ex- 
pressed in terms of the means and standard deviations respectively 
and, therefore, should not again be divided by the mean. It will be 
seen that in the first pair of columns the probable errors given by the 
formula approach those actually computed rather closely. The largest 
difference, which occurs in the case of Monroe Reading Rate, is only 
about 4 per cent of the size of the P.E. In the case of the ratios of the 
probable errors to the means, the agreement is exact to two decimal 
places. In the case of their ratios to the standard deviations, the 
differences are somewhat greater, running up to almost one-fourth of 


2 The latter have already been given in Table V, but are repeated here.




the ratios themselves. It appears from these comparisons that, in 
the case of the data dealt with, the application of the formula given 
above to mean scores yields probable errors of measurement and 
ratios of such errors to the means accurate enough for all practical 
purposes. This statement can not be made, however, for the ratios 
of the probable errors to the standard deviations. 
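
The substitution of group averages can be sketched as follows. The mean ages, probable errors, and function names are hypothetical. The second function is only one reading of "dropping the denominator Y": it keeps the numerator of the quotient formula and feeds it probable errors that have already been expressed as ratios to the mean or to the standard deviation.

```python
import math


def pe_quotient_group(mean_x, mean_y, pe_x, pe_y):
    """Quotient formula evaluated once at the group's mean ages."""
    return math.sqrt((mean_x * pe_y / mean_y) ** 2 + pe_x ** 2) / mean_y


def pe_ratio_group(mean_x, mean_y, ratio_x, ratio_y):
    """Same formula with the final division by Y omitted (assumed reading).

    ratio_x, ratio_y: probable errors already divided by the mean (or by the
    standard deviation) of the achievement and mental ages respectively.
    """
    return math.sqrt((mean_x * ratio_y / mean_y) ** 2 + ratio_x ** 2)


# Hypothetical group statistics, in months.
MEAN_AA, MEAN_MA = 130.0, 125.0
PE_AA, PE_MA = 5.0, 4.0

pe_aq = pe_quotient_group(MEAN_AA, MEAN_MA, PE_AA, PE_MA)
ratio_aq = pe_ratio_group(MEAN_AA, MEAN_MA, PE_AA / MEAN_AA, PE_MA / MEAN_MA)
print(f"P.E. of A.Q. for the group: {pe_aq:.4f}")
print(f"ratio of P.E. to mean:      {ratio_aq:.4f}")
```

Each function is solved once for the whole group, which is the property sought in the text.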

It is perhaps needless to say that the present study is too limited in 
its scope for these results to be taken as definite proof that the values 
yielded by the formula will always approach those actually obtained
from the quotients as closely as in this case. On the other hand, the
evidence that this is true is sufficient to carry considerable weight and 
to justify one in proceeding tentatively on this basis until further 
data bearing on the point have been collected and published. 
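
The agreement in the first pair of columns of Table VI can be tallied directly. The figures below are the probable errors in months as printed in the table. The variable names, and the choice to express each difference as a fraction of the value obtained from the quotients, are assumptions made for this sketch.

```python
# (by formula, from quotients): P.E. of measurement in months, from Table VI.
TABLE_VI_MONTHS = {
    "Stanford Arithmetic": (4.8, 4.6),
    "Monroe Arithmetic": (9.6, 9.6),
    "Monroe Reading Comprehension": (8.9, 9.2),
    "Monroe Reading Rate": (12.5, 13.0),
}

abs_diff = {}
rel_diff = {}
for test, (by_formula, from_quotients) in TABLE_VI_MONTHS.items():
    abs_diff[test] = abs(by_formula - from_quotients)
    rel_diff[test] = abs_diff[test] / from_quotients
    print(f"{test:30s} diff={abs_diff[test]:.1f}  relative={rel_diff[test]:.1%}")

# Test with the largest absolute difference between the two methods.
largest = max(abs_diff, key=abs_diff.get)
print("largest absolute difference:", largest)
```

On these figures the largest absolute difference is the half month for Monroe Reading Rate, a little under 4 per cent of the value obtained from the quotients, and every relative difference is below 5 per cent.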

The second of the methods which the writer found to give ap- 
proximations to the actual probable errors is that used in the case of 
sums and differences. The standard formula for the probable error 
of a sum or difference on the assumption that the quantities composing
it are uncorrelated is as follows:3

$$\mathrm{P.E.}_{X \pm Y} = \sqrt{\mathrm{P.E.}_{X}^{2} + \mathrm{P.E.}_{Y}^{2}}$$

It occurred to the writer that the probable error of a quotient might
be of somewhat the same size as that of a difference; therefore he 
experimented with this formula. If the probable errors of measure- 
ment of the achievement and mental ages are substituted therein, the 
results are not at all similar to those actually obtained for achieve- 
ment quotients. The reason for this is easy to see; the ages are ex- 
pressed in terms of an entirely different unit from that used in the 
quotients. In the case of the formula discussed above, this situation 
was taken care of by dividing by the denominator, but in the formula 
just given, no such division occurs, or, in other words, nothing is done 
to change the unit employed from that of the numerator and the 
denominator to that used in the quotient. This lack of the same units 
may be taken care of, however, by dividing the probable errors of 
measurement by their means or standard deviations. The writer,
therefore, tried out the formula for the probable error of the dif-
ference by substituting in it the ratios of the probable errors of the
achievement and mental ages, respectively, to their means and standard
deviations. The results, along with the actually obtained similar ratios
previously given in Table V, are shown in Table VII. The first pair of
columns in the table compares the results by the two methods for the
ratios of the probable errors of measurement to the means and the


3 Adapted from:
Yule, G. U. An Introduction to the Theory of Statistics. London: Charles Griffin
and Company, 1919, p. 211.




TABLE VII. COMPARISON OF RATIOS OF PROBABLE ERRORS OF MEASUREMENT
TO THE MEAN AND STANDARD DEVIATION COMPUTED BY THE FORMULA FOR
THE PROBABLE ERROR OF A DIFFERENCE WITH THOSE OBTAINED
FROM THE ACTUAL ACHIEVEMENT QUOTIENTS


                                  P.E. meas. / M           P.E. meas. / σ
                                  By         From          By           From
                                  Formula    Quotients     Formula      Quotients
Stanford Arithmetic               .05        .05           .44          .45
Monroe Arithmetic                 .08        .08           .50          .47
Monroe Reading Comprehension      .09        .09           [illegible]  .53
Monroe Reading Rate               .10        .11           .50          .44


second pair of columns for the ratios to the standard deviations. As 
will be seen, the entries in the first pair carried to the second deci- 
mal place differ in only one case out of the four, and this difference 
is as small as possible, being only .01. The differences in the case 
of the entries in the second pair of columns are somewhat greater, 
though not very large. On the whole it appears that the formula for 
the error of a difference gives an approximately correct error for a 
quotient also when the ratios of probable errors to means are em- 
ployed, and a somewhat less satisfactory one when their ratios to the 
standard deviations are used. Probably it should not be applied in 
connection with the latter, although the errors involved in doing so 
are not great. There is little choice between the two methods of ap-
proximation, the first being slightly more accurate for the ratio of
the probable error to the mean, and the second for its ratio to the
standard deviation.
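
A short sketch may help show why the two approximations agree so closely for ratios to the mean. Dividing the quotient formula by X/Y gives exactly the difference formula applied to the ratios of the probable errors to X and Y. This identity is an observation added here, not a statement made by the writer, and the numbers and function names are hypothetical.

```python
import math


def pe_difference(pe_a, pe_b):
    """Probable error of a sum or difference of two uncorrelated quantities."""
    return math.sqrt(pe_a ** 2 + pe_b ** 2)


def pe_quotient(x, y, pe_x, pe_y):
    """Probable error of the quotient x / y by the quotient formula."""
    return math.sqrt((x * pe_y / y) ** 2 + pe_x ** 2) / y


# Hypothetical group means and probable errors, in months.
MEAN_AA, MEAN_MA = 130.0, 125.0
PE_AA, PE_MA = 5.0, 4.0

# Raw ages substituted directly: the result is in months, not quotient units.
raw = pe_difference(PE_AA, PE_MA)

# Ratios to the means substituted instead: the result is unit-free.
by_difference = pe_difference(PE_AA / MEAN_AA, PE_MA / MEAN_MA)

# Quotient formula at the means, expressed relative to the quotient itself.
by_quotient = pe_quotient(MEAN_AA, MEAN_MA, PE_AA, PE_MA) / (MEAN_AA / MEAN_MA)

print(f"raw ages:            {raw:.3f} months")
print(f"difference formula:  {by_difference:.4f}")
print(f"quotient formula:    {by_quotient:.4f}")
```

The last two values coincide. Since the mean of the quotients is only approximately the quotient of the means, agreement with ratios computed from actual quotients can be expected to be close rather than exact.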


APPENDIX C 


A COMPARISON OF THE RELIABILITY OF QUOTIENT 
MEASURES AND DIFFERENCE MEASURES 


From the standpoint of reliability, there appears to be little choice 
between difference measures and quotient measures. Foran4 is among
those making the statement that there is little difference between the 
two kinds of measures with regard to reliability, but neither he nor 
anyone else, in so far as the writer knows, has cited suitable data to 
prove this contention. From a logical standpoint, it seems reasonable 
that if measures of achievement and of capacity possess given 
amounts of unreliability which tend to be cumulative for measures 
of achievement relative to capacity, the total unreliability would be 
about the same for either difference or quotient measures. In order 
to investigate the truth of this assumption, however, the writer found 
the probable errors of measurement for both difference and quotient 
measures computed from the same test scores. The results are given 
in Table VIII. The first half of this table contains the coefficients 
of correlation and of alienation, the probable errors of measurement 
and the ratios of these errors to the means and to the standard devia- 
tions for differences between achievement scores and intelligence
scores. The second half of the table contains the same items for the
quotient measures derived from the same test scores as furnished
the basis for the difference measures.5

It will be seen from the figures in this table that there is a slight 
tendency for the coefficients of correlation of the difference measures 
to be larger and the corresponding coefficients of alienation smaller, 
than those of the quotient measures, but that this tendency is not 
strong enough to be significant. The probable errors of measurement 
of the difference measures are distinctly greater than those of the 
quotient measures, but their ratios to the means and standard devia- 
tions are in most cases slightly less than the corresponding ratios 
for the quotient measures. On the whole, although these data show 
that the reliability of the difference measures is greater than that of 
the quotient measures, the difference is so small as to justify the 
assumption that there is little difference in the reliability of difference
and quotient scores.
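
The quantities compared in Table VIII can be sketched as below, assuming the definitions customary in educational measurement: a coefficient of alienation k = sqrt(1 - r^2) and a probable error of measurement of .6745 σ sqrt(1 - r). The appendix does not restate these formulas, so both of them, together with the function name and the sample inputs, are assumptions here.

```python
import math


def reliability_summary(r, sd, mean):
    """Reliability measures for one kind of score (assumed definitions).

    r: coefficient of correlation between two forms of the same measure
    sd, mean: standard deviation and mean of the scores
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    k = math.sqrt(1.0 - r ** 2)                 # coefficient of alienation
    pe_meas = 0.6745 * sd * math.sqrt(1.0 - r)  # probable error of measurement
    return {
        "r": r,
        "k": k,
        "pe_meas": pe_meas,
        "pe_over_mean": pe_meas / mean,
        "pe_over_sd": pe_meas / sd,
    }


# Hypothetical inputs for a single kind of score.
summary = reliability_summary(r=0.75, sd=10.0, mean=100.0)
for key, value in summary.items():
    print(f"{key:13s} {value:.4f}")
```

Under these assumed definitions the ratio of the probable error to the standard deviation depends on r alone, so two kinds of score with nearly equal coefficients of correlation will necessarily show nearly equal ratios of this kind.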


4 Foran, T. G. “The Meaning and Limitations of Scores, Norms, and Standards in Edu-
cational Measurement,” Catholic University of America Educational Research Bulletin, Vol. 3,
Catholic Education Press, February, 1928, p. 16-19, 23-26.


5 The second half of the table is taken from Table V on p. 45.


TABLE VIII. COMPARISON OF RELIABILITY OF DIFFERENCE MEASURES AND QUOTIENT
MEASURES DERIVED FROM THE SAME TEST SCORES

[Table printed sideways in the original; its entries are not legible in this copy. Columns, given first for Difference Measures and again for Quotient Measures: coefficient of correlation, coefficient of alienation, P.E. meas., P.E. meas. / M, P.E. meas. / σ. Rows: Stanford Arithmetic, Monroe Arithmetic, Monroe Reading Comprehension, Monroe Reading Rate.]

